Developing Algorithm For Matching Arabic Names Entered by Mobile Phone
- Arabic Language
- Name Matching
- Levenshtein Distance
- Mobile Phone
- Phone Keyboard Arabic
Copyright (c) 2019 International Journal For Research In Advanced Computer Science And Engineering (ISSN: 2208-2107)
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Name matching plays a vital role in many applications. It is used, for example, in information retrieval and deduplication systems to compare names and to find those that refer to the same object, person, or company. Because names in any application are subject to unavoidable variations and errors, many algorithms have been developed for matching them. These algorithms account for name variations caused by spelling, pattern, or phonetic modifications. However, most existing methods were developed for English and reflect the characteristics of that language. Up to now, none has been designed and implemented specifically for Arabic. The purpose of this study is to present a name matching algorithm for the Arabic language. After considering the major algorithms in this area, we selected one of the basic name matching methods, the Levenshtein distance (LD), and extended it to work particularly well for Arabic names. The proposed algorithms are based on the proximity and spacing of Arabic characters on the mobile phone keyboard, in order to give more accurate results for Arabic names. Experiments were carried out to evaluate the proposed algorithms (LD_F, LD_S, LD_KM and LD_KE). The first experiment was based on the F-Dataset, which contains 15 pairs of names. Its results showed that the proposed algorithms gave more accurate results than the Levenshtein algorithm. They can therefore be used in many applications, such as Automatic Spell Correction (ASC), Search Engines (SE), Data Retrieval (DR), Computational Biology (DNA), Customer Relationship Management (CRM), Customer Data Integration (CDI), Anti-Money Laundering (AML) and Criminal Investigation (CI).
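The general idea of weighting Levenshtein substitutions by key spacing can be sketched as follows. This is only an illustration under stated assumptions, not the LD_F, LD_S, LD_KM or LD_KE definitions, which the abstract does not give. The key grid `ROWS`, the cost function `key_cost`, and the normalization in `similarity` are all hypothetical choices made for this example.

```python
from math import hypot

# Hypothetical key grid (an assumption for illustration only; real mobile
# phone layouts differ and the paper's layout is not given in the abstract).
ROWS = [
    "ضصثقفغعهخحجد",
    "شسيبلاتنمكط",
    "ئءؤرىةوزظ",
]

# Map each letter to its (row, column) position on the grid.
POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}

# Largest distance between any two keys, used to scale costs into (0, 1].
MAX_DIST = max(
    hypot(r1 - r2, c1 - c2)
    for (r1, c1) in POS.values()
    for (r2, c2) in POS.values()
)


def unit_cost(a, b):
    """Classic Levenshtein substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def key_cost(a, b):
    """Assumed substitution cost: grows with the distance between two keys."""
    if a == b:
        return 0.0
    if a not in POS or b not in POS:
        return 1.0  # unknown characters fall back to the classic cost
    (r1, c1), (r2, c2) = POS[a], POS[b]
    return hypot(r1 - r2, c1 - c2) / MAX_DIST


def levenshtein(a, b, sub_cost=unit_cost):
    """Edit distance with unit insert/delete cost and pluggable substitution."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [float(i)]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1.0,                   # delete ca
                curr[j - 1] + 1.0,               # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def similarity(a, b, sub_cost=unit_cost):
    """Assumed normalization: 1 - distance / length of the longer name."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b, sub_cost) / longest


if __name__ == "__main__":
    # Both variants differ from the first name by one letter, so the classic
    # distance treats them alike; the keyboard cost favours the adjacent key.
    for other in ("شامي", "دامي"):
        print(other, similarity("سامي", other), similarity("سامي", other, key_cost))
```

With this assumed grid, a substitution between neighbouring keys costs much less than one between distant keys, so a likely typing slip lowers the similarity score less than an unrelated letter does. Insertions and deletions keep the unit cost of the classic algorithm.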
Table 5. The average similarity of the LD, LD_S, LD_F, LD_KE and LD_KM algorithms