The article “SciMAT: A new science mapping analysis software tool” by M.J. Cobo, A.G. López-Herrera, E. Herrera-Viedma, and F. Herrera, published in 2012, introduces SciMAT (Science Mapping Analysis software Tool), an open-source tool for performing science mapping analysis within a longitudinal framework. Science mapping, also known as bibliometric mapping, is an area of bibliometrics that provides a spatial representation of how academic entities such as disciplines, fields, specialties, documents, and authors relate to one another. Its goal is to monitor scientific fields, delimit research areas, and show the structure and evolution of scientific research. This kind of analysis helps uncover hidden key elements in an area of interest, and it serves both academic work and competitive intelligence, for example patent analysis in R&D departments.
Before SciMAT, researchers often had to combine several software tools to complete a full science mapping workflow: some general-purpose (e.g., Pajek, Gephi) and others specific to science mapping but often ad hoc (e.g., CoPalRed, VOSviewer). A previous analysis by the authors concluded that no single existing tool was powerful and flexible enough to integrate all the necessary steps: data retrieval, preprocessing, network extraction, normalization, mapping, analysis, visualization, and interpretation. SciMAT was developed to close this gap by incorporating methods, algorithms, and measures for every step of the general science mapping workflow.
SciMAT’s Three Key Distinguishing Features: SciMAT stands out from other science mapping tools due to three remarkable capabilities:
- Powerful Preprocessing Module: SciMAT implements an extensive range of tools for cleaning raw bibliographical data, which is crucial for obtaining reliable analysis results. This includes detecting duplicate and misspelled items, time slicing, and performing data and network reduction. The module allows users to import data from various bibliographical sources like ISI Web of Science (ISI-CE format) and Scopus (RIS format), as well as a specific CSV format. A core innovation is the use of “groups” for entities like Author, Word, and Reference, which facilitates the de-duplicating process by linking similar items under a single group, preventing data loss.
- Use of Bibliometric Measures for Impact Assessment: Science maps can be enriched with bibliometric measures that quantify the impact and quality of the studied elements, such as clusters or evolution areas. SciMAT implements a wide range of citation-based measures, including the h-index, g-index, hg-index, and q²-index, which provide insights into the interest and impact of research within the specialized community.
- A Wizard to Configure the Analysis: SciMAT includes a user-friendly wizard that guides the analyst through the configuration of the various steps of science mapping analysis. This allows users to select specific measures, algorithms, and analysis techniques for each stage, providing flexibility and control over the analytical process.
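The citation-based measures named in the list above have compact standard definitions. The Python sketch below is illustrative only and is not SciMAT's own (Java) implementation; the function names and the sample citation counts are hypothetical. It assumes the usual definitions: h is the largest number such that h documents have at least h citations each, g is the largest number such that the top g documents together have at least g² citations (capped here at the number of documents), hg is the geometric mean of h and g, and q² is the geometric mean of h and the median citation count of the h-core.

```python
import math
from statistics import median


def h_index(citations):
    """Largest h such that h documents have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites < rank:
            break
        h = rank
    return h


def g_index(citations):
    """Largest g such that the top g documents total at least g^2 citations.

    Assumption: g is capped at the number of documents.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def hg_index(citations):
    """Geometric mean of the h-index and the g-index."""
    return math.sqrt(h_index(citations) * g_index(citations))


def q2_index(citations):
    """Geometric mean of h and the median citation count of the h-core."""
    h = h_index(citations)
    if h == 0:
        return 0.0
    h_core = sorted(citations, reverse=True)[:h]
    return math.sqrt(h * median(h_core))


# Made-up citation counts for the documents of one cluster.
cluster_citations = [10, 8, 5, 4, 3, 0]
print("h =", h_index(cluster_citations), "g =", g_index(cluster_citations))
print("hg =", round(hg_index(cluster_citations), 3))
print("q2 =", round(q2_index(cluster_citations), 3))
```

Applied to the documents of a cluster, these values give a rough picture of how much attention a theme has received, which is the role the measures play in the performance analysis.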
Comprehensive Science Mapping Workflow in SciMAT: The SciMAT workflow is meticulously structured into four main stages, ensuring a thorough and configurable analysis:
- Build the Data Set:
  - Period Selection: Users define and select time periods for longitudinal analysis, allowing for the study of structural evolution over time.
  - Unit of Analysis Selection: The tool supports analyzing conceptual, social, or intellectual aspects by choosing from five types of groups: Author Group, Word Group, Reference Group, Author-Reference Group, or Source-Reference Group.
  - Data Reduction: Users can filter data based on a minimum frequency threshold for units of analysis within each period, ensuring that only the most representative data is considered.
- Create and Normalize the Network:
  - Network Construction: SciMAT facilitates the building of bibliometric networks using various relations such as co-occurrence, coupling, or aggregated coupling. This allows for the construction of 20 types of networks, including widely used ones like coauthor, cocitation, bibliographic coupling, and co-word networks, as well as more advanced ones.
  - Network Reduction: An optional step allows filtering network edges based on a minimum threshold edge value for each period, retaining only the most significant relations.
  - Normalization: The constructed network is normalized using various similarity measures, including association strength, Equivalence Index, Inclusion Index, Jaccard Index, and Salton’s cosine.
- Apply a Clustering Algorithm to Get the Map:
  - SciMAT offers several clustering algorithms to build the science map, such as the Simple Centers Algorithm, Single-linkage, Complete-linkage, Average-linkage, and Sum-linkage.
- Apply a Set of Analyses:
  - Network Analysis: By default, SciMAT automatically calculates Callon’s density and centrality measures for each detected cluster. Centrality quantifies external cohesion (interaction with other networks), while density measures internal strength (internal cohesion of the network). These measures are crucial for categorizing clusters within strategic diagrams.
  - Performance Analysis: This feature quantifies the impact and quality of clusters. SciMAT associates sets of documents with clusters using document mapper functions. Once documents are associated, a range of bibliometric measures (e.g., sum, minimum, maximum, and average citations, along with h-, g-, hg-, and q²-indices) is calculated to assess cluster quality and impact. The document mappers are:
    - Core document mapper: Returns the documents associated with at least two nodes of the cluster.
    - Secondary document mapper: Returns the documents associated with only one node of the cluster.
    - k-core document mapper: A generalization of the core mapper, returning the documents associated with at least ‘k’ nodes.
    - Union (∪) document mapper: Returns the algebraic union of documents associated with the subset of nodes.
    - Intersection (∩) document mapper: Returns the algebraic intersection of documents associated with the subset of nodes.
  - Temporal or Longitudinal Analysis: This enables tracking the conceptual, social, or intellectual evolution of a field over consecutive time periods. SciMAT builds evolution maps to detect “evolution areas” and overlapping items graphs to visualize how items and themes persist or emerge across periods. Measures like association strength, Equivalence Index, Inclusion Index, Jaccard’s Index, and Salton’s cosine can be used to calculate the weight of “evolution nexus” between items in consecutive periods.
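The document mappers described under performance analysis are essentially set operations over the documents attached to a cluster's nodes. The following Python sketch shows one way to express them. It is not SciMAT's API: the function names, the `docs_by_node` dictionary, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def node_hits(cluster_nodes, docs_by_node):
    """Count, for every document, how many of the cluster's nodes it is linked to."""
    hits = Counter()
    for node in set(cluster_nodes):
        for doc in set(docs_by_node.get(node, ())):
            hits[doc] += 1
    return hits


def k_core_mapper(cluster_nodes, docs_by_node, k):
    """Documents linked to at least k nodes of the cluster."""
    hits = node_hits(cluster_nodes, docs_by_node)
    return {doc for doc, n in hits.items() if n >= k}


def core_mapper(cluster_nodes, docs_by_node):
    """Documents linked to at least two nodes (the k-core mapper with k = 2)."""
    return k_core_mapper(cluster_nodes, docs_by_node, 2)


def secondary_mapper(cluster_nodes, docs_by_node):
    """Documents linked to exactly one node of the cluster."""
    hits = node_hits(cluster_nodes, docs_by_node)
    return {doc for doc, n in hits.items() if n == 1}


def union_mapper(cluster_nodes, docs_by_node):
    """Documents linked to any node of the cluster."""
    return set(node_hits(cluster_nodes, docs_by_node))


def intersection_mapper(cluster_nodes, docs_by_node):
    """Documents linked to every node of the cluster."""
    return k_core_mapper(cluster_nodes, docs_by_node, len(set(cluster_nodes)))


# Toy data: keyword -> identifiers of the documents that use it.
docs_by_node = {
    "fuzzy": {"d1", "d2", "d3"},
    "logic": {"d2", "d3", "d4"},
    "control": {"d3", "d5"},
}
cluster = {"fuzzy", "logic", "control"}

print("core:", sorted(core_mapper(cluster, docs_by_node)))
print("secondary:", sorted(secondary_mapper(cluster, docs_by_node)))
print("union:", sorted(union_mapper(cluster, docs_by_node)))
print("intersection:", sorted(intersection_mapper(cluster, docs_by_node)))
```

In this formulation the core and secondary sets are disjoint and together make up the union, so the choice of mapper decides how strictly a document must match a theme before its citations count toward that theme.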
Visualization and Interpretation: SciMAT offers diverse visualization techniques to represent science maps and analysis results, facilitating understanding and interpretation:
- Strategic Diagrams: These two-dimensional plots categorize clusters based on their Callon’s density and centrality measures. They can be enriched by displaying bibliometric measures, with sphere volumes potentially representing citations achieved by each cluster.
- Cluster Networks: These show the graphical relationships between items within a specific cluster.
- Evolution Maps: These illustrate how clusters evolve through different periods, showing origins, interrelationships, and the emergence or discontinuation of themes.
- Overlapping Maps/Graphs: These visualize items shared by two consecutive periods, new items, and discontinued items, often including a “Stability Index”.
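Strategic diagrams position each cluster by its Callon centrality and density. As a rough sketch, the code below assumes the definitions commonly used in co-word analysis: edge weights are equivalence indices, e_ij = c_ij² / (c_i · c_j); centrality is 10 times the sum of the weights of links between the cluster's items and items outside it; and density is 100 times the sum of the internal link weights divided by the number of items in the cluster. The function names and the toy counts are hypothetical, every item outside the cluster is treated as belonging to another theme, and none of this is taken from SciMAT's code.

```python
from itertools import combinations


def equivalence_index(c_ij, c_i, c_j):
    """e_ij = c_ij^2 / (c_i * c_j), a value between 0 and 1."""
    return (c_ij ** 2) / (c_i * c_j)


def normalize(cooccurrence, frequency):
    """Turn raw co-occurrence counts into equivalence-index edge weights.

    cooccurrence maps frozenset({a, b}) to the number of documents in which
    a and b appear together; frequency maps each item to its document count.
    """
    weights = {}
    for pair, count in cooccurrence.items():
        a, b = pair
        weights[pair] = equivalence_index(count, frequency[a], frequency[b])
    return weights


def callon_density(cluster, weights):
    """100 * (sum of weights of links inside the cluster) / (number of items)."""
    internal = sum(
        weights.get(frozenset(pair), 0.0)
        for pair in combinations(sorted(cluster), 2)
    )
    return 100.0 * internal / len(cluster)


def callon_centrality(cluster, weights):
    """10 * (sum of weights of links with exactly one endpoint in the cluster)."""
    external = sum(w for pair, w in weights.items() if len(pair & cluster) == 1)
    return 10.0 * external


# Toy data: item frequencies and pairwise co-occurrence counts.
frequency = {"a": 4, "b": 4, "c": 2, "d": 2}
cooccurrence = {
    frozenset({"a", "b"}): 2,
    frozenset({"a", "c"}): 2,
    frozenset({"c", "d"}): 1,
}
weights = normalize(cooccurrence, frequency)

theme = {"a", "b", "c"}
print("density:", callon_density(theme, weights))
print("centrality:", callon_centrality(theme, weights))
```

A cluster with high values on both axes is strongly connected internally and to the rest of the network, while low values on both indicate a weakly developed, peripheral theme.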
Analysis results can be exported in HTML or LaTeX formats for detailed reports. Images (strategic diagrams, evolution maps, etc.) can be exported in PNG and SVG formats for easy editing, and cluster networks and evolution maps are also available in Pajek format. The final and crucial step involves the analyst’s interpretation of these results to derive valuable insights and inform decision-making.
Architecture and Technology: SciMAT is developed in Java, ensuring cross-platform compatibility across Windows, MacOS, and Linux operating systems. Its architecture leverages an object-oriented methodology and various design patterns (e.g., Observable, Observer, Data Access Object, Data Transfer Object) to promote extensibility and maintainability. The knowledge base, which stores relations between entities such as Authors, Words, Journals, and References, is managed by SQLite, a serverless, cross-platform relational database management system known for its reliability and minimal system requirements. The core of SciMAT consists of a Model for database management and the SciMAT Application Programming Interface (API), which contains all necessary methods for science mapping analysis and is extensible by advanced users for custom algorithms or data loaders.
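To make the combination of SQLite, Data Access Objects, and Data Transfer Objects more concrete, here is a minimal sketch in Python using the standard `sqlite3` module. SciMAT itself is written in Java, and the table layout, class names, and author names below are invented for this illustration; they do not reflect SciMAT's actual schema or API. The sketch also shows the idea behind “groups”: name variants are linked to one group instead of being merged or deleted, so no original data is lost.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AuthorDTO:
    """Plain data carrier passed between the storage layer and its callers."""
    author_id: int
    name: str
    group_id: Optional[int]


class AuthorDAO:
    """Hides the SQL behind a small, task-oriented interface."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        # Hypothetical schema, invented for this sketch.
        self.conn.executescript(
            """
            CREATE TABLE IF NOT EXISTS author_group (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL UNIQUE
            );
            CREATE TABLE IF NOT EXISTS author (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                group_id INTEGER REFERENCES author_group(id)
            );
            """
        )

    def add_author(self, name: str) -> int:
        cur = self.conn.execute("INSERT INTO author (name) VALUES (?)", (name,))
        return cur.lastrowid

    def add_group(self, name: str) -> int:
        cur = self.conn.execute(
            "INSERT INTO author_group (name) VALUES (?)", (name,)
        )
        return cur.lastrowid

    def assign_to_group(self, author_id: int, group_id: int) -> None:
        self.conn.execute(
            "UPDATE author SET group_id = ? WHERE id = ?", (group_id, author_id)
        )

    def authors_in_group(self, group_id: int) -> List[AuthorDTO]:
        rows = self.conn.execute(
            "SELECT id, name, group_id FROM author WHERE group_id = ? ORDER BY id",
            (group_id,),
        )
        return [AuthorDTO(*row) for row in rows]


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
dao = AuthorDAO(conn)
group_id = dao.add_group("Smith, J.")
for variant in ("Smith, J.", "Smith, J", "Smith, John"):
    dao.assign_to_group(dao.add_author(variant), group_id)

print([author.name for author in dao.authors_in_group(group_id)])
```

Because callers only see `AuthorDAO` methods and `AuthorDTO` values, the storage details could change without touching the analysis code, which is the kind of extensibility the design patterns are meant to support.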
Broad Applicability and Validation: SciMAT’s versatility makes it applicable in numerous scenarios:
- Journal Editors: Can analyze journal topics, their evolution, and the impact (citations, productivity) of themes, identifying “hot” topics or suitable authors for special issues.
- Researchers: Can analyze their own research fields to identify influential themes, key references, and potential collaboration networks.
- Universities/Research Institutions: Can identify important and internationally relevant research areas, assess productivity and impact, and make informed decisions on resource allocation or journal subscriptions.
- Government Policy Makers: Can identify fruitful or promising research areas to focus resources and identify leading researchers and their collaborations.
- Business Intelligence Departments: Can uncover competitors’ hidden know-how from publications, identify principal researchers, and assess competitive positioning.
The tool underwent a rigorous user validation test with 15 diverse academic professionals, including senior researchers and PhD students, who used SciMAT with their own data sets across five different research topics. Feedback, gathered through an adapted version of the Questionnaire for User Interface Satisfaction (QUIS), focused on improving the interface, interactivity, navigation flow, and reporting capabilities. While some users initially found SciMAT more challenging to learn than other tools, they ultimately found it comfortable and effective after prolonged use. All suggestions were incorporated into Version 1.1 of SciMAT, enhancing its usability and functionality.
In comparison to other science mapping tools, SciMAT’s unique strengths lie in its configurable wizard, comprehensive integration of the entire workflow within a longitudinal framework, and its strong emphasis on quantifying results with impact measures.
Reference: Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2012). SciMAT: A new science mapping analysis software tool. Journal of the American Society for Information Science and Technology, 63(8), 1609–1630. doi:10.1002/asi.22688.
