Talk:Daitch–Mokotoff Soundex
DM Codes
D-M can return up to 32 distinct codes, not just two. Your samples are a little off as well:
Auerbach ==> 097500,097400
Peters ==> 739400,734000
Peterson ==> 739460,734600
Uhrbach ==> 097500,097400
And check out some of these:
Jackson ==> 154600,454600,145460,445460
(compound name) Jackson-Johnson ==> 154664,454664,145466,445466,154646,454646,145464,445464
Every time the algorithm encounters certain letter combinations, the number of results effectively doubles. CK is one example: the rule for CK is "try K (5) and TSK (45)", so each code built so far splits in two. Since there is a maximum of 6 digits per result, you can theoretically have up to 2 ^ (6 - 1) = 32 possible results. To check your results, use the calculator at http://www.jewishgen.org/jos/jossound.htm. According to Mr. Mokotoff, the creator of D-M Soundex, this calculator is the "official" implementation of the algorithm. There is also a SQL Server implementation based on the ruleset at http://www.avotaynu.com/soundex.html; it can be found at http://www.sqlservercentral.com/columnists/mcoles/sql2000dbatoolkitpart3.asp.
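To illustrate the doubling, here is a minimal Python sketch. It is not a D-M Soundex implementation: it has no rule table, and the per-token codes for "Jackson" are supplied by hand (J as 1 or 4, CK as 5 or 45, S as 4, N as 6), chosen to be consistent with the sample output above. The function name `expand_codes` and the simple "skip a repeated adjacent code" rule are assumptions made for this example; the real algorithm's adjacency rules have more cases.

```python
from itertools import product


def expand_codes(token_alternatives, length=6):
    """Combine per-token alternatives into all distinct fixed-length codes.

    token_alternatives is a list with one entry per coded letter group;
    each entry lists the alternative digit strings for that group.
    """
    results = []
    for combo in product(*token_alternatives):
        digits = []
        previous = None
        for code in combo:
            # Simplification: drop a token whose code repeats the previous one.
            if code != previous:
                digits.append(code)
            previous = code
        # Truncate to `length` digits, then pad on the right with zeros.
        padded = "".join(digits)[:length].ljust(length, "0")
        if padded not in results:
            results.append(padded)
    return results


# Hand-assigned alternatives for "Jackson" (assumed for illustration):
# J -> 1 or 4, CK -> 5 or 45, S -> 4, N -> 6
jackson = [["1", "4"], ["5", "45"], ["4"], ["6"]]
print(expand_codes(jackson))
```

Two branching groups give 2 x 2 = 4 codes here, the same four listed for Jackson above. Each further branching group doubles the count again, until truncation to 6 digits makes some of the results identical.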
Enjoy.
- Fixed.
Inventors
BTW, D-M Soundex is often referred to as "Jewish Soundex" or "Eastern European Soundex", although the authors discourage the use of those nicknames. The "official name", according to the creators, is the "Daitch-Mokotoff Soundex Algorithm" (or D-M Soundex). It is the official searching algorithm for the Holocaust Museum and for the Ellis Island Database Project. It was invented by Gary Mokotoff and improved by Randy Daitch one year later; the article, however, implies that the two genealogists co-invented it back in 1985.
- Fixed
Beider-Morse Phonetic Name Matching - New Page?
With the new link that contains detailed information on the BMPM algorithm, should this be spun off to its own page? KosherJava (talk) 03:35, 15 February 2009 (UTC)