: : I need to write a routine that receives two strings and returns the percentage (from 0% to 100%) of likeness between them.
: : For example:
: :
: : I compare these two strings:
: : Hello Jhon
: : Hello Jon
: : And the routine tells me that there's a 90% likeness between the two of them.
: :
: : I don't have any idea where to start, but I guess this is what Google does when you search for something and it asks whether you meant something else.
: :
: : Does anyone know where I can start, or can you send me a link to a tutorial? I have spent days searching to no avail.
: :
: : Thanks in advance!
: :
:
: My question is: what does the percentage really mean? In your example of comparing
:
: Hello Jhon
: Hello Jon
:
: They match in 9 characters (counting the space). That is 9 out of 10 characters in the first string (90%), but 9 out of 9 in the second one (100%). If you average them, that's 95%. So you really need a solid definition of the percentage, because there are many possibilities. Also, how do you define the percentage for this example?
:
: Hello Jhon
: Hello John
:
: They match 100% in the characters they contain, but nobody (and especially not strcmp()) would say that they are matching strings. Maybe 70%, since the first 7 characters match, but then this goes back to the first problem if the strings are different lengths...
:
: Tough problem... Good luck... Like I said, once you have a good meaning defined for the percentage, it will be easier to figure out what to do...
:
Good point. The code should be able to detect differences like extra spaces, added symbols, or missing symbols. This is getting tough. I've been investigating and it's harder than I thought; many algorithms have already been created to solve this, but they are not very simple to understand.
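
For reference, here is a minimal sketch of one such algorithm, the Levenshtein edit distance, which counts the fewest single-character insertions, deletions, and substitutions needed to turn one string into the other. This is only one possible definition of "likeness", not necessarily the one the thread settles on. The function names levenshtein() and likeness_percent() are made up for illustration, and dividing the distance by the length of the longer string is an assumption about how to turn the distance into a percentage.

```c
#include <stdlib.h>
#include <string.h>

/* Levenshtein distance between a and b, computed one row at a time.
   Returns (size_t)-1 if memory allocation fails. */
static size_t levenshtein(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t *row = malloc((lb + 1) * sizeof *row);
    if (row == NULL)
        return (size_t)-1;

    /* Distance from the empty prefix of a to each prefix of b. */
    for (size_t j = 0; j <= lb; j++)
        row[j] = j;

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];          /* previous row, column j-1 */
        row[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];        /* previous row, column j */
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = diag + cost;             /* substitute or match */
            if (up + 1 < best)
                best = up + 1;                     /* delete from a */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;             /* insert into a */
            row[j] = best;
            diag = up;
        }
    }

    size_t d = row[lb];
    free(row);
    return d;
}

/* Hypothetical helper: likeness = 100 * (1 - distance / longer length).
   Returns -1.0 if the distance could not be computed. */
double likeness_percent(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t longest = (la > lb) ? la : lb;
    if (longest == 0)
        return 100.0;                  /* two empty strings are identical */

    size_t d = levenshtein(a, b);
    if (d == (size_t)-1)
        return -1.0;
    return 100.0 * (1.0 - (double)d / (double)longest);
}
```

Under this definition, likeness_percent("Hello Jhon", "Hello Jon") comes out at about 90 (one deletion over 10 characters), and likeness_percent("Hello Jhon", "Hello John") at about 80, because plain Levenshtein counts the swapped "ho"/"oh" as two substitutions. A variant that treats a swap of adjacent characters as a single edit (Damerau-Levenshtein) would score the second pair higher.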