Detect data types (date, time, datetime, country, address, currency and many more). A vibe coded library and CLI.

brainless/detectype

Detectype

A Rust library and CLI tool for detecting data types from strings, starting with comprehensive date detection.

Features

  • Date Detection: Supports multiple date formats including:

    • ISO 8601 (2023-12-25, 2023-12-25T10:30:00Z)
    • US formats (12/25/2023, 12-25-2023)
    • European formats (25.12.2023)
    • Unix timestamps (1703462400, 1703462400000)
    • Year-only (2023)
    • RFC formats (RFC 2822, RFC 3339)
    • Multi-language support:
      • English: "January 15 2023", "15 January 2023"
      • Spanish: "15 enero 2023", "25 de diciembre de 2023"
  • Confidence Scoring: Each detection includes a confidence score

  • Extensive Testing: Unit tests, property-based tests, and fuzzing

  • CLI Tool: Command-line interface for batch processing
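The README does not say how confidence scores are computed or consumed. As a rough sketch of one way a caller might use them, the snippet below keeps the highest-scoring candidate above a cutoff. The `Candidate` struct, the `best_candidate` function, and the threshold parameter are all hypothetical and are not part of the detectype API.

```rust
/// Hypothetical detection candidate: a format label plus a score in 0.0..=1.0.
/// This is an illustration only, not a type exported by detectype.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub format: &'static str,
    pub confidence: f64,
}

/// Returns the candidate with the highest confidence that is at least
/// `threshold`, or `None` when no candidate clears the cutoff.
pub fn best_candidate(candidates: &[Candidate], threshold: f64) -> Option<&Candidate> {
    candidates
        .iter()
        .filter(|c| c.confidence >= threshold)
        // total_cmp gives a total order on f64, so no unwrap is needed.
        .max_by(|a, b| a.confidence.total_cmp(&b.confidence))
}
```

With candidates scored 0.4 and 0.95 and a threshold of 0.5, only the second one is returned; raising the threshold above 0.95 yields `None`.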

Usage

As a Library

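use detectype::{detect_type, DataType};

fn main() {
    // An ISO 8601 date is classified as a date.
    let result = detect_type("2023-12-25");
    assert_eq!(result, DataType::Date);

    // Free-form text is classified as a string.
    let result = detect_type("hello world");
    assert_eq!(result, DataType::String);
}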

CLI Tool

# Single input
detectype "2023-12-25"

# With detailed information (including language detection)
detectype --verbose "15 enero 2023"
# Output: Date (Date format: DAY_MONTH_YEAR_ES (confidence: 0.85, language: es))

# From file
detectype --file dates.txt

# From stdin with verbose output
echo "25 de diciembre de 2023" | detectype --stdin --verbose

Installation

cargo build --release

Testing

# Run all tests
cargo test

# Run property-based tests
cargo test --test date_property_tests

# Run integration tests  
cargo test --test integration_tests

Supported Date Formats

Format               Example                  Pattern
ISO 8601 Date        2023-12-25               YYYY-MM-DD
ISO 8601 DateTime    2023-12-25T10:30:00Z     YYYY-MM-DDTHH:MM:SSZ
US Date (slash)      12/25/2023               MM/DD/YYYY
US Date (dash)       12-25-2023               MM-DD-YYYY
European Date        25.12.2023               DD.MM.YYYY
Unix Timestamp       1703462400               10 digits
Unix Timestamp (ms)  1703462400000            13 digits
Year Only            2023                     YYYY
English Natural      January 15 2023          Month Day Year
English Natural      15 January 2023          Day Month Year
Spanish Natural      15 enero 2023            Day Month Year
Spanish Natural      25 de diciembre de 2023  Day de Month de Year
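To make the numeric rows of the table concrete, here is a minimal sketch that classifies a string by shape alone, using only the standard library. It is not the crate's implementation: `DateShape`, `digit_groups`, and `classify_shape` are hypothetical names, the sketch skips the datetime and natural-language rows, and it does not validate ranges.

```rust
/// Hypothetical labels mirroring the numeric rows of the format table.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum DateShape {
    IsoDate,
    UsSlash,
    UsDash,
    European,
    UnixSeconds,
    UnixMillis,
    YearOnly,
}

/// Returns true when `s` is made of digit groups with the given lengths,
/// joined by `sep` (for example lengths [4, 2, 2] and '-' for YYYY-MM-DD).
fn digit_groups(s: &str, sep: char, lens: &[usize]) -> bool {
    let parts: Vec<&str> = s.split(sep).collect();
    parts.len() == lens.len()
        && parts
            .iter()
            .zip(lens)
            .all(|(p, &n)| p.len() == n && p.bytes().all(|b| b.is_ascii_digit()))
}

/// Classifies by shape only; "2023-99-99" would still be reported as IsoDate.
pub fn classify_shape(input: &str) -> Option<DateShape> {
    let s = input.trim();
    let all_digits = !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    if all_digits {
        // Pure digit strings are told apart by length, as in the table.
        return match s.len() {
            4 => Some(DateShape::YearOnly),
            10 => Some(DateShape::UnixSeconds),
            13 => Some(DateShape::UnixMillis),
            _ => None,
        };
    }
    if digit_groups(s, '-', &[4, 2, 2]) {
        Some(DateShape::IsoDate)
    } else if digit_groups(s, '/', &[2, 2, 4]) {
        Some(DateShape::UsSlash)
    } else if digit_groups(s, '-', &[2, 2, 4]) {
        Some(DateShape::UsDash)
    } else if digit_groups(s, '.', &[2, 2, 4]) {
        Some(DateShape::European)
    } else {
        None
    }
}
```

A real detector would also need to check that the month and day are in range and to decide how confident it is that a bare 4-digit or 10-digit number is a date at all.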

Architecture

  • src/lib.rs - Main library interface
  • src/detectors/date.rs - Date detection implementation
  • src/error.rs - Error handling
  • src/types.rs - Type definitions
  • src/bin/main.rs - CLI tool
  • tests/ - Comprehensive test suite

Multi-Language Support

The library currently supports date detection in:

  • English: Full and abbreviated month names (January/Jan, February/Feb, etc.)
  • Spanish: Full and abbreviated month names (enero/ene, febrero/feb, etc.)

Language Detection Features:

  • Automatic language detection based on month names
  • Resolves ambiguous abbreviations in favor of English (for example, "feb" is valid in both languages)
  • Supports Spanish prepositions ("de") in date formats
  • Case-insensitive matching
  • Confidence scoring per language
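The month-name lookup described above could look roughly like the following sketch. It is an assumption about the approach, not the code in src/detectors/date.rs: `Lang`, `find_month`, and `detect_month` are hypothetical names, and the Spanish abbreviations beyond "ene" and "feb" are filled in from common usage rather than from the crate.

```rust
/// Hypothetical language tag for a matched month name.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Lang {
    En,
    Es,
}

/// (full name, abbreviation) pairs, lowercase, January first.
const EN_MONTHS: [(&str, &str); 12] = [
    ("january", "jan"), ("february", "feb"), ("march", "mar"),
    ("april", "apr"), ("may", "may"), ("june", "jun"),
    ("july", "jul"), ("august", "aug"), ("september", "sep"),
    ("october", "oct"), ("november", "nov"), ("december", "dec"),
];

const ES_MONTHS: [(&str, &str); 12] = [
    ("enero", "ene"), ("febrero", "feb"), ("marzo", "mar"),
    ("abril", "abr"), ("mayo", "may"), ("junio", "jun"),
    ("julio", "jul"), ("agosto", "ago"), ("septiembre", "sep"),
    ("octubre", "oct"), ("noviembre", "nov"), ("diciembre", "dic"),
];

/// Looks up a lowercase word in one table and returns the month number (1-12).
fn find_month(table: &[(&str, &str)], word: &str) -> Option<u32> {
    table
        .iter()
        .position(|(full, abbr)| *full == word || *abbr == word)
        .map(|i| i as u32 + 1)
}

/// Case-insensitive lookup. English is tried first, so abbreviations shared
/// by both languages (such as "feb") are reported as English.
pub fn detect_month(word: &str) -> Option<(u32, Lang)> {
    let w = word.trim().to_lowercase();
    find_month(&EN_MONTHS, &w)
        .map(|m| (m, Lang::En))
        .or_else(|| find_month(&ES_MONTHS, &w).map(|m| (m, Lang::Es)))
}
```

Trying the English table first is what implements the "English wins on conflict" rule here; a per-language confidence score could instead lower the score whenever a word matches more than one table.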

Adding New Languages:

The architecture is designed for easy language extension. To add a new language:

  1. Add month name mappings in src/detectors/date.rs
  2. Add regex patterns for language-specific formats
  3. Update the parsing logic in parse_natural_language_date
  4. Add comprehensive tests
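As an illustration of the steps above, the sketch below describes each language as data, so that adding French amounts to adding one table. This is a simplified, table-driven stand-in, not the crate's regex-based code: `LanguageSpec`, `SPANISH`, `FRENCH`, and `parse_day_month_year` are hypothetical names, and the real `parse_natural_language_date` may be structured quite differently.

```rust
/// Hypothetical per-language description: a code, twelve lowercase month
/// names, and filler words (such as the Spanish "de") to skip when parsing.
pub struct LanguageSpec {
    pub code: &'static str,
    pub months: [&'static str; 12],
    pub fillers: &'static [&'static str],
}

pub const SPANISH: LanguageSpec = LanguageSpec {
    code: "es",
    months: [
        "enero", "febrero", "marzo", "abril", "mayo", "junio", "julio",
        "agosto", "septiembre", "octubre", "noviembre", "diciembre",
    ],
    fillers: &["de"],
};

/// The "new language" in this sketch: only a table needs to be added.
pub const FRENCH: LanguageSpec = LanguageSpec {
    code: "fr",
    months: [
        "janvier", "février", "mars", "avril", "mai", "juin", "juillet",
        "août", "septembre", "octobre", "novembre", "décembre",
    ],
    fillers: &[],
};

/// Parses "Day Month Year" text and returns (year, month, day, language code).
/// Only a coarse 1..=31 day check is applied; month lengths are not validated.
pub fn parse_day_month_year(
    input: &str,
    langs: &[LanguageSpec],
) -> Option<(i32, u32, u32, &'static str)> {
    let lower = input.trim().to_lowercase();
    for lang in langs {
        // Drop this language's filler words, then expect exactly three tokens.
        let words: Vec<&str> = lower
            .split_whitespace()
            .filter(|w| !lang.fillers.contains(w))
            .collect();
        if let [day, month, year] = words.as_slice() {
            let month_num = lang.months.iter().position(|m| m == month);
            if let (Ok(d), Some(m), Ok(y)) =
                (day.parse::<u32>(), month_num, year.parse::<i32>())
            {
                if (1..=31).contains(&d) {
                    return Some((y, m as u32 + 1, d, lang.code));
                }
            }
        }
    }
    None
}
```

Calling `parse_day_month_year("14 juillet 2023", &[SPANISH, FRENCH])` matches the French table, while the same input with only `SPANISH` in the list returns `None`. The matching tests for a new language would cover both its full names and any filler words it introduces.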

Future Enhancements

  • Integer detection
  • Float detection
  • Boolean detection
  • Email detection
  • URL detection
  • Phone number detection
  • Credit card detection
  • Additional languages (French, German, Italian, etc.)
