This workshop is co-located with MODELS, in Munich. It will be a discussion-based event, calling for extended abstracts to kickstart the discussions. We will host a keynote by Lionel Briand (University of Luxembourg) about empirical evaluation.
In software engineering (SE), the intrinsically soft nature of the object under study often makes the evaluation process fuzzy. Moreover, Model-Driven Engineering (MDE) faces specific challenges that are less pronounced in other areas of software engineering. For example, unlike fields such as data/process mining, which can draw on plentiful data from code repositories such as GitHub, MDE research rarely has access to large datasets for empirical evaluation. In this workshop, we would like to address the topic of evaluating contributions in software engineering in general, and its relationship with MDE in particular. The workshop will focus on discussions, in which experts in software engineering will share examples of good and bad evaluations based on their own experience. The objective of these discussions is to identify how to properly evaluate a research contribution in software engineering, with a particular focus on MDE. As a result, we expect to sketch a model of evaluation, which will be described in a workshop report.
The very idea of this workshop was born on Twitter, starting with Loli Burgueño asking her Twitter audience about the difficulty of getting a paper accepted without it being attacked on its validation/evaluation. Her message was retweeted (among others) by the Spanish Scientific Society of Computer Science (SCIE) and the Spanish Society of Software Engineering and Software Development Technologies (SISTEDES).
As software engineering researchers, we are all confronted with the evaluation of our results. How does this new result improve the state of the art? In MDE, which suffers from an image of a heavyweight approach, it is even more crucial to evaluate the benefits of the described results and measure their cost. However, the evaluation of such results is often identified as the weakest point of a paper during the review process. As notions of acceptable evaluation are not rigorously defined in the MDE community, it is left to reviewer subjectivity to judge the methodology used to evaluate the quality of the researchers' work. On the other hand, since there are no methodologies or even guidelines on how authors should evaluate their contributions, or on what is expected from an evaluation, authors may have doubts about how to properly carry out this part of the work. For example, at a very high level, do researchers have to address performance, completeness, expressiveness, usefulness, etc.? All of them? Only some of them?
There is always ambiguity, subjectivity, and imprecision about what makes an acceptable evaluation in SE/MDE research. It is not clear what either authors or reviewers should expect of an evaluation section. More generally, setting aside the publication process, what does it mean to evaluate a result? When can a researcher be confident enough in her evaluation process to defend her contribution? How can (bad) reviewer arguments be defused with facts? What are these facts? How can such facts be used to convince industrial partners to adopt the evaluated results?
Our objective is to bring together, in a unique forum, people from different software engineering communities to exchange ideas and experiences on evaluation practices in MDE. Based on this diverse audience, we expect to publish an open-access workshop report that will bring together the different points of view, hints, and pieces of advice that one can draw on when confronted with the evaluation of an MDE contribution. We expect contributions of different kinds from the workshop participants, including, but not limited to:
The workshop will not follow a classical paper-based approach. We believe it is necessary to lower the entry barrier in order to gather a large audience and obtain fruitful discussions.
Therefore, we will only call for extended abstracts of up to two pages (LNCS format). This very short format will help participants present their points of view and will serve as a starting point for discussions. We particularly look forward to examples of validation, good or bad, encountered by the participants.