dc.contributor.author: Elarnaoty, Mohammed
dc.contributor.author: Servant-Cortés, Francisco Javier
dc.date.accessioned: 2024-12-12T09:52:46Z
dc.date.available: 2024-12-12T09:52:46Z
dc.date.issued: 2024
dc.identifier.citation: Mohammed El Arnaoty, Francisco Servant, OneSpace: Detecting cross-language clones by learning a common embedding space, Journal of Systems and Software, Volume 208, 2024, 111911, ISSN 0164-1212, DOI: https://doi.org/10.1016/j.jss.2023.111911 [es_ES]
dc.identifier.uri: https://hdl.handle.net/10630/35608
dc.description.abstract: Identifying cloned code fragments across different languages can enhance the productivity of software developers in several ways. However, the clone detection task is often studied in the context of a single language and is less explored for code snippets spanning different languages. In this paper, we present OneSpace, a new cross-language clone detection approach. OneSpace projects different programming languages to the same embedding space using both code and API data. OneSpace then leverages a Siamese network to infer the similarity of the embedded programs. We evaluate OneSpace by detecting clones across three language pairs: Java-Python, Java-C++, and Java-C. We compared OneSpace with the other state-of-the-art techniques, SupLearn and CLCDSA. In our evaluation, OneSpace provided higher effectiveness than the state of the art. Our ablation study validated some of our intuitions in designing OneSpace, particularly that using a single embedding space (as opposed to separate ones) provides higher effectiveness. Additionally, we designed a variant of OneSpace that uses the Word Mover's Distance algorithm and provides lower effectiveness, but is much more efficient. We also found that OneSpace provides higher effectiveness than the state of the art even for complex implementations, single-method implementations, varying ratios of positive to negative clones in training, varying amounts of training data, and additional programming languages. [es_ES]
dc.description.sponsorship: NSF CCF-2046403, URJC C01INVESDIST, AEI PID2022-142964OA-I00. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier [es_ES]
dc.rights: info:eu-repo/semantics/openAccess [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Software - Design [es_ES]
dc.subject.other: Clone detection [es_ES]
dc.subject.other: Software engineering [es_ES]
dc.subject.other: Machine learning [es_ES]
dc.subject.other: Code embedding [es_ES]
dc.subject.other: Siamese neural networks [es_ES]
dc.title: OneSpace: Detecting cross-language clones by learning a common embedding space [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.identifier.doi: 10.1016/j.jss.2023.111911
dc.rights.cc: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.type.hasVersion: info:eu-repo/semantics/acceptedVersion [es_ES]


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International