With more than 1,200 publications over the past two decades, experimental mobile-assisted language learning (MALL) studies targeting second/foreign language (L2) acquisition outcomes are certainly not lacking in quantity. Their research quality, on the other hand, has often been called into question, most notably with regard to the adequacy of their assessment instruments and statistical analyses. Yet limiting the determination of research quality to the evaluation of testing procedures, and the statistical analysis of the results they produce, ignores the critical relevance of the underlying research parameters that generate the results in the first place. A comprehensive evaluation of quantitative experimental L2 acquisition MALL research quality, encompassing design as well as assessment instruments and statistical analysis, thus remains to be undertaken. The present investigation endeavors to do so based on an extensive compilation of 737 MALL studies published between 2000 and 2021. The research quality of these publications is evaluated according to four main parameters: language acquisition moderators, treatment intervention conditions, assessment instruments, and statistical analysis. These are applied according to a modified version of the Checklist for the Rigor of Education-Experiment Designs (CREED), which classifies research design quality into five levels: low, medium-low, medium, medium-high, and high. With over three-quarters of all studies falling within the low category, the results leave much to be desired. Since the modified CREED algorithm developed here can equally be applied to studies from their inception, it offers a way forward to improve the research quality of future experimental MALL studies.