VisualizationSystem is a data analysis application that clusters and visualizes multi-parameter data using graphs and plots. It provides flexible configuration options for data analysis, clustering, and visualization.
- Data Import: Supports importing Excel files with automatic detection of the data structure.
- Multi-tab Interface: Each data view or visualization opens in its own tab.
- Advanced Comparison Algorithm: Compares objects based on numerical and categorical parameters, allowing multiple values per cell (separated by commas).
- Flexible Similarity Settings: Lets you adjust comparison thresholds, set parameter weights, and enable or disable individual parameters.
- Multiple Clustering Algorithms: The implemented clustering methods are:
  - K-means.
  - DBSCAN.
  - Hierarchical Agglomerative Clustering (HAC).
- Builds connections between nodes using a full pairwise comparison algorithm.
- Configurable similarity thresholds.
- Edges colored by similarity percentage (green for high similarity, red for low).
- Double-click a node to view detailed object information, including its parameters and comparison results.
- Node color intensity reflects the number of connections.
- No connection is created if similarity is below the threshold.
- Nodes colored by cluster membership.
- Connections show the strongest relationships between objects (these can be below the threshold).
- Scatter Plots:
- Cluster Blocks:
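
The pairwise comparison and threshold-based edges described in the lists above can be illustrated with a minimal Python sketch. It is not the application's actual code: the per-parameter measures (a range-scaled difference for numbers, Jaccard overlap for comma-separated categorical cells), the weighted average, the red-to-green interpolation, and all function and key names are assumptions chosen for illustration.

```python
from itertools import combinations

def parse_cell(cell):
    """Split a comma-separated cell into a set of trimmed, non-empty values."""
    return {part.strip() for part in str(cell).split(",") if part.strip()}

def categorical_similarity(a, b):
    """Jaccard overlap of the value sets in two cells (assumed measure)."""
    set_a, set_b = parse_cell(a), parse_cell(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

def numeric_similarity(a, b, value_range):
    """1.0 for equal values, falling linearly to 0.0 across the column range."""
    if value_range == 0:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / value_range)

def object_similarity(obj_a, obj_b, params):
    """Weighted average of per-parameter similarities over enabled parameters."""
    total, weight_sum = 0.0, 0.0
    for name, spec in params.items():
        if not spec.get("enabled", True):
            continue  # disabled parameters do not contribute
        if spec["kind"] == "numeric":
            s = numeric_similarity(obj_a[name], obj_b[name], spec["range"])
        else:
            s = categorical_similarity(obj_a[name], obj_b[name])
        total += spec["weight"] * s
        weight_sum += spec["weight"]
    return total / weight_sum if weight_sum else 0.0

def edge_color(similarity):
    """Interpolate from red (0.0) to green (1.0); returns an (r, g, b) tuple."""
    s = min(1.0, max(0.0, similarity))
    return (round(255 * (1 - s)), round(255 * s), 0)

def build_edges(objects, params, threshold):
    """Compare every pair once; keep an edge only at or above the threshold."""
    edges = []
    for (i, a), (j, b) in combinations(enumerate(objects), 2):
        s = object_similarity(a, b, params)
        if s >= threshold:
            edges.append((i, j, s, edge_color(s)))
    return edges
```

Calling `build_edges(objects, params, 0.5)` with a list of row dictionaries and a parameter specification such as `{"score": {"kind": "numeric", "weight": 1.0, "range": 10.0}, "tags": {"kind": "categorical", "weight": 1.0}}` returns one `(i, j, similarity, color)` tuple per retained pair. A full pairwise comparison grows quadratically with the number of objects, which is worth keeping in mind for large datasets.
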
Custom implementation with the following features:
- Data normalization using MinMax method for numerical data
- One-Hot encoding for categorical data
- Customizable algorithm parameters:
  - K-means:
    - Number of clusters,
    - Maximum iterations.
  - DBSCAN:
    - Epsilon (maximum distance between points),
    - Minimum points for cluster formation.
  - Hierarchical Agglomerative Clustering (HAC):
    - Merge threshold.
- K-means:
- Support for multiple distance metrics:
- Euclidean distance for numerical data.
- Hamming distance for categorical data.
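
The preprocessing and distance steps listed above can be sketched in Python as follows. This is an illustrative sketch only, not the application's implementation: the function names are hypothetical, and summing the Euclidean and Hamming parts into one distance is an assumption, since the way the two metrics are combined is not specified here.

```python
import math

def min_max_normalize(values):
    """Rescale a numeric column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def mixed_distance(num_a, num_b, cat_a, cat_b):
    """Assumed combination: Euclidean on the numeric part plus Hamming on the categorical part."""
    return euclidean(num_a, num_b) + hamming(cat_a, cat_b)
```

For example, `min_max_normalize([2, 4, 6])` yields `[0.0, 0.5, 1.0]`, and `one_hot_encode(["red", "blue", "red"])` yields the categories `["blue", "red"]` with rows `[[0, 1], [1, 0], [0, 1]]`. Note that the Hamming distance between two different one-hot rows is 2, because a category mismatch flips two indicator positions.
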
The application leverages several external libraries to enhance data visualization and processing:
- ExcelDataReader: Excel file parsing.
- Microsoft MSAGL: Graph visualization.
- ScottPlot: Scatter plot rendering.
- ML.NET: Principal component analysis (PCA).
The application is well suited for analyzing data from the Fragile States Index (https://fragilestatesindex.org/excel/). These datasets provide rich, multi-parameter information that can be processed and visualized effectively with the application.
- Provider: MS SQL Server
- Default database name: node_objects_db
- The connection string must be configured for non-local server instances.
- Data Menu:
  - "Load Excel": Import data from an Excel file.
  - "View data": Open a tab with a table view of the current dataset.
  - "Datasets": Select or delete any loaded dataset.
- Visualization Menu:
  - "Build graph": Open a tab with a graph generated from the current settings.
  - "Build plot": Open a tab with a plot generated from the current settings.
- "Settings": Open the settings menu.