Code that compares two strings and returns % of likeness

I need to write a routine that receives two strings and returns the percentage (from 0% to 100%) of likeness between them.
For example:

I compare these two strings:
Hello Jhon
Hello Jon
And the routine tells me that there is a 90% likeness between the two of them.

I don't have any idea where to start, but I guess this is what Google does when you search for something and it asks whether you meant something else.

Does anyone know where I can start, or could you send me a link to a tutorial? I have spent days searching to no avail.

Thanks in advance!


Comments

  • Start by comparing the lengths of the strings with strlen. If they aren't equal, loop through the shorter one and compare every character with the corresponding character in the other string.

    You have to decide what to do when two characters aren't equal: will you skip that character, decrease the likeness percentage, and go on to the next one? Or will you do something else?

    You will have to think of things like this:

    Hello Jhon
    Hello Jon

    or

    Hello Jhon
    Hello J on

    Are the two examples equally alike? You may have to go through the strings several times with different algorithms. Do characters count as unequal if they differ only in upper/lower case? And so on...
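A minimal sketch of the position-by-position comparison described in the reply above. The function name positional_likeness, the ignore_case flag, and the choice to divide by the longer length (so that unmatched trailing characters lower the score) are assumptions made for illustration, not something the thread settles.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: percentage of positions at which a and b hold the
 * same character. Positions past the end of the shorter string count as
 * mismatches, so the divisor is the longer length. */
double positional_likeness(const char *a, const char *b, int ignore_case)
{
    assert(a != NULL && b != NULL);

    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    size_t shorter = len_a < len_b ? len_a : len_b;
    size_t longer = len_a < len_b ? len_b : len_a;
    size_t matches = 0;

    if (longer == 0)
        return 100.0; /* assumption: two empty strings are identical */

    for (size_t i = 0; i < shorter; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        if (ignore_case) {
            ca = (unsigned char)tolower(ca);
            cb = (unsigned char)tolower(cb);
        }
        if (ca == cb)
            matches++;
    }
    return 100.0 * (double)matches / (double)longer;
}
```

Under these assumptions, "Hello Jhon" against "Hello Jon" scores 70% while "Hello Jhon" against "Hello J on" scores 90%, because a single missing character shifts every position after it. That is the weakness the two examples point at.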
  • Thank you, I'm going to try that.
  • My question is: what does the percentage really mean? In your example of comparing

    Hello Jhon
    Hello Jon

    They match on 9 characters (counting the space). That is 9 out of 10 characters in the first string (90%), but 9 out of 9 in the second one (100%). If you average them, that's 95%. So you really need a solid definition of the percentage, because there are many possibilities. Also, how do you define the percentage for this example?

    Hello Jhon
    Hello John

    They match 100% in the letters they contain, but nobody (and especially not strcmp()) would say that they are matching strings. Maybe 70%, since the first 7 characters match, but then this goes back to the first problem if the strings are different lengths.

    Tough problem. Good luck! Like I said, once you have a good meaning defined for the percentage, it will be easier to figure out what to do.
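To make the three candidate percentages from the reply above concrete, here is a hedged sketch. It assumes that "matching 9 letters" means the longest common subsequence (characters that appear in both strings in the same order); that reading is an interpretation, not something the thread states. The names Likeness, lcs_length and lcs_likeness are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The three candidate definitions discussed above, side by side. */
typedef struct {
    double of_first;   /* matched characters as % of the first string  */
    double of_second;  /* matched characters as % of the second string */
    double average;    /* mean of the two percentages                  */
} Likeness;

/* Length of the longest common subsequence of a and b, computed with two
 * rows of the usual dynamic-programming table. Returns 0 if allocation
 * fails (a simplification for this sketch). */
size_t lcs_length(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a), lb = strlen(b);
    size_t *prev = calloc(lb + 1, sizeof *prev);
    size_t *curr = calloc(lb + 1, sizeof *curr);
    size_t result = 0;

    if (prev != NULL && curr != NULL) {
        for (size_t i = 1; i <= la; i++) {
            curr[0] = 0;
            for (size_t j = 1; j <= lb; j++) {
                if (a[i - 1] == b[j - 1])
                    curr[j] = prev[j - 1] + 1;
                else
                    curr[j] = prev[j] > curr[j - 1] ? prev[j] : curr[j - 1];
            }
            size_t *tmp = prev; /* the row just filled becomes "prev" */
            prev = curr;
            curr = tmp;
        }
        result = prev[lb];
    }
    free(prev);
    free(curr);
    return result;
}

Likeness lcs_likeness(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t matched = lcs_length(a, b);
    Likeness r;

    /* Assumption: an empty string counts as 100% matched. */
    r.of_first = la ? 100.0 * (double)matched / (double)la : 100.0;
    r.of_second = lb ? 100.0 * (double)matched / (double)lb : 100.0;
    r.average = (r.of_first + r.of_second) / 2.0;
    return r;
}
```

For "Hello Jhon" and "Hello Jon" this gives 90%, 100% and 95%, the same three figures as in the reply. For "Hello Jhon" and "Hello John" the common subsequence is also 9 characters long, so all three come out at 90%.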
  • Good point. The code should be able to detect differences like extra spaces, added symbols, or missing symbols. This is getting tough. I've been investigating and it's harder than I thought: many algorithms have already been created to solve this, but they are not very simple to understand.
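One well-known algorithm of this kind is the Levenshtein edit distance, which counts the fewest single-character insertions, deletions and substitutions needed to turn one string into the other. The thread does not name it, so treat the sketch below as one possible choice rather than the answer. The function names and the conversion to a percentage (edits measured against the longer length) are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Levenshtein distance using a single row of the DP table.
 * Returns SIZE_MAX if allocation fails. */
size_t levenshtein(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a), lb = strlen(b);
    size_t *row = malloc((lb + 1) * sizeof *row);
    if (row == NULL)
        return SIZE_MAX;

    for (size_t j = 0; j <= lb; j++)
        row[j] = j; /* turning "" into the first j chars of b: j inserts */

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0]; /* table[i-1][0] */
        row[0] = i;           /* turning i chars of a into "": i deletes */
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j]; /* table[i-1][j] */
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = diag + cost;   /* match or substitute */
            if (up + 1 < best)
                best = up + 1;           /* delete a[i-1] */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;   /* insert b[j-1] */
            row[j] = best;
            diag = up;
        }
    }

    size_t distance = row[lb];
    free(row);
    return distance;
}

/* Hypothetical conversion to a percentage: 100% means no edits needed,
 * 0% means every position of the longer string had to be edited.
 * Returns -1.0 if the distance could not be computed. */
double edit_likeness(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t longer = la > lb ? la : lb;

    if (longer == 0)
        return 100.0; /* assumption: two empty strings are identical */

    size_t d = levenshtein(a, b);
    if (d == SIZE_MAX)
        return -1.0;
    return 100.0 * (double)(longer - d) / (double)longer;
}
```

With this definition, "Hello Jhon" and "Hello Jon" are one deletion apart, which gives 90%, the figure from the original question. An extra space or an added symbol costs one edit each, so those differences are picked up as well.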
  • Like everyone has been saying, it really depends on what kind of comparison you wish to do. If you just want to compare the occurrence of letters, you could use an array of chars; if you want to compare the arrangement, you'd need to loop through each string to search for similar letters. What exactly do you want to measure as far as similarity is concerned?
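A rough sketch of the letter-occurrence idea from the last reply, using one counter per possible char value. The name letter_count_likeness and the choice to divide the shared count by the longer length are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts how often each char value occurs in a and
 * in b, then reports the characters the two strings have in common
 * (ignoring order) as a percentage of the longer string's length. */
double letter_count_likeness(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t count_a[UCHAR_MAX + 1] = {0};
    size_t count_b[UCHAR_MAX + 1] = {0};
    size_t la = strlen(a), lb = strlen(b);
    size_t longer = la > lb ? la : lb;
    size_t shared = 0;

    if (longer == 0)
        return 100.0; /* assumption: two empty strings are identical */

    for (size_t i = 0; i < la; i++)
        count_a[(unsigned char)a[i]]++;
    for (size_t i = 0; i < lb; i++)
        count_b[(unsigned char)b[i]]++;

    /* A character is shared as many times as it occurs in both strings. */
    for (int c = 0; c <= UCHAR_MAX; c++)
        shared += count_a[c] < count_b[c] ? count_a[c] : count_b[c];

    return 100.0 * (double)shared / (double)longer;
}
```

Because order is ignored, "Hello Jhon" and "Hello John" score 100% here, so this measure only makes sense if the arrangement of letters does not matter for your purpose.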