Apply clustering to analyze categorical data in longitudinal studies


Mohammad Mahdi Hassan, Martin Blom, Gufran Ahmad Ansari


Vol. 19  No. 4  pp. 10-19


It is common to collect data from practitioners in the software engineering field using surveys and questionnaires. This data is usually analyzed using descriptive statistics, where the entire population is treated as an undivided group, sometimes complemented by sampling methods to obtain variations within the sample. In many cases, the survey population is partitioned into smaller groups using available background knowledge about the participants. These techniques are valid, but they can only reveal opinion diversity if it correlates with the background variables, and they fail to identify sub-groups that span multiple background variables. The existing approaches can thus capture the general trends but might miss the opinions of different minority sub-groups. This problem becomes more complex in longitudinal studies, where minority opinions might fade or resolve over time. Data from longitudinal studies may contain patterns that can be extracted using a clustering process. These patterns may unveil supplementary information and draw attention to viewpoints other than those exhibited by the sample population as a whole. This approach may reveal the range of opinion variations between diverse groups over time and make it possible to identify the minorities. In our research, we have investigated the suitability of clustering techniques for analyzing categorical data from longitudinal studies.


Empirical Survey, Longitudinal Study, Clustering, Partitioning, Grouping, Data Mining, Expert Opinion, Diversity.