Self-admitted technical debt in R: detection and causes

dc.contributor.authorSharma, Rishab
dc.contributor.authorShahbazi, Ramin
dc.contributor.authorFard, Fatemeh H.
dc.contributor.authorCodabux, Zadia
dc.contributor.authorVidoni, Melina
dc.date.accessioned2023-12-04T00:36:28Z
dc.date.available2023-12-04T00:36:28Z
dc.date.issued2022-08-25
dc.date.updated2022-08-28T10:05:37Z
dc.description.abstractSelf-Admitted Technical Debt (SATD) is primarily studied in Object-Oriented (OO) languages and traditionally commercial software. However, scientific software coded in dynamically-typed languages such as R differs in paradigm, and the source code comments’ semantics are different (i.e., more aligned with algorithms and statistics when compared to traditional software). Additionally, many Software Engineering topics are understudied in scientific software development, with SATD detection remaining a challenge for this domain. This gap adds complexity since prior works determined SATD in scientific software does not adjust to many of the keywords identified for OO SATD, possibly hindering its automated detection. Therefore, we investigated how classification models (traditional machine learning, deep neural networks, and deep neural Pre-Trained Language Models (PTMs)) automatically detect SATD in R packages. This study aims to study the capabilities of these models to classify different TD types in this domain and manually analyze the causes of each in a representative sample. Our results show that PTMs (i.e., RoBERTa) outperform other models and work well when the number of comments labelled as a particular SATD type has low occurrences. We also found that some SATD types are more challenging to detect. We manually identified sixteen causes, including eight new causes detected by our study. The most common cause was failure to remember, in agreement with previous studies. These findings will help the R package authors automatically identify SATD in their source code and improve their code quality. In the future, checklists for R developers can also be developed by scientific communities such as rOpenSci to guarantee a higher quality of packages before submissionen_AU
dc.description.sponsorshipThis study is partly supported by the Natural Sciences and Engineering Research Council of Canada, RGPIN-2021-04232 and DGECR-2021-00283 at the University of Saskatchewan, and RGPIN-2019-05175 at the University of British Columbia. Open Access funding enabled and organized by CAUL and its Member Institutions.en_AU
dc.format.mimetypeapplication/pdfen_AU
dc.identifier.issn1573-7535en_AU
dc.identifier.urihttp://hdl.handle.net/1885/307635
dc.language.isoen_AUen_AU
dc.provenanceThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.en_AU
dc.publisherSpringer USen_AU
dc.rights© The Author(s) 2022en_AU
dc.rights.licenseCreative Commons Attribution 4.0 International Licenseen_AU
dc.rights.urihttps://creativecommons.org/licenses/by/4.0/en_AU
dc.sourceAutomated Software Engineeringen_AU
dc.subjectSelf-admitted technical debten_AU
dc.subjectR packagesen_AU
dc.subjectMachine learningen_AU
dc.subjectDeep learningen_AU
dc.subjectDeep neural pre-trained language modelsen_AU
dc.titleSelf-admitted technical debt in R: detection and causesen_AU
dc.typeJournal articleen_AU
dcterms.accessRightsOpen Accessen_AU
local.bibliographicCitation.issue2en_AU
local.bibliographicCitation.lastpage41en_AU
local.bibliographicCitation.startpage1en_AU
local.contributor.affiliationVidoni, Melina, CECS School of Computing, The Australian National Universityen_AU
local.description.notesImported from Springer Natureen_AU
local.identifier.ariespublicationu1118090xPUB9
local.identifier.citationvolume29en_AU
local.identifier.doi10.1007/s10515-022-00358-6en_AU
local.publisher.urlhttps://link.springer.com/en_AU
local.type.statusPublished Versionen_AU

Downloads

Original bundle

Name:
s10515-022-00358-6.pdf
Size:
1.28 MB
Format:
Adobe Portable Document Format