Extract

Welcome to the documentation for extract, a command-line utility that parses text for network identifiers such as IP addresses, CIDR ranges and MAC addresses.
Features
- IPv4 and IPv6 extraction
- CIDR block recognition
- MAC address matching
- IP range parsing
- Streaming operation for large files
Installation
See the project README for platform-specific binaries, or build from source:
cargo build --release
Getting Started
Run extract with some input text to see immediate results:
echo 'Access from 192.168.1.1 to 10.0.0.0/8' | extract
Output:
192.168.1.1
10.0.0.0/8
The following chapters provide more examples and advanced usage tips.
Examples
The following snippets demonstrate common usage patterns.
# Basic extraction from a string
echo "Server at 192.168.1.1 connected" | extract
# Process a log file and remove duplicates
extract /var/log/firewall.log | sort | uniq
# Interactive mode with your editor
extract
# type or paste text, then save and exit
# Combine with grep to filter for a specific network
extract syslog.txt | grep '^192\.168\.1\.'
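The deduplication and filtering steps above can also be written as small reusable filters. This is a minimal sketch that assumes extract prints one identifier per line, as the Getting Started output suggests. The function names dedupe_in_order and with_prefix are hypothetical helpers and not part of extract.

```shell
# Drop repeated lines while keeping first-seen order. No sort is needed,
# so each new line is passed on as soon as it is read.
dedupe_in_order() {
    awk '!seen[$0]++'
}

# Keep only lines that start with a literal prefix such as "192.168.1.".
# index() compares plain strings, so the dots are not regex wildcards.
with_prefix() {
    awk -v p="$1" 'index($0, p) == 1'
}

# Possible usage, with the file name taken from the example above:
#   extract syslog.txt | dedupe_in_order | with_prefix '192.168.1.'
```

Unlike sort | uniq, the awk filter keeps every distinct line in memory, so it trades memory for preserving the original order.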
Advanced Tips
- Use custom regular expressions in config.toml to match proprietary patterns.
- Combine extract with other Unix tools for powerful pipelines.
- For performance benchmarking run cargo bench --bench performance.
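As one illustration of combining extract with other Unix tools, the sketch below ranks identifiers by how often they occur. It assumes one identifier per line on standard input. The function name top_identifiers is a hypothetical helper.

```shell
# Count how often each line occurs and print "count identifier",
# most frequent first. Ties are ordered by identifier.
top_identifiers() {
    sort | uniq -c | sort -k1,1nr -k2 | awk '{ print $1, $2 }'
}

# Possible usage, with the log path taken from the Examples section:
#   extract /var/log/firewall.log | top_identifiers | head -n 10
```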
Custom Configuration
Create ~/.config/extract/config.toml to tweak behaviour. You can supply your
own regex patterns and adjust the logging level. For quick one-off patterns, use
the --regex flag on the command line instead, repeating it for multiple patterns.
An example config.toml:
log_level = "debug"
[custom_regexes]
"SERIAL\\d+" = "$0"
These patterns match anywhere on a line. Add word boundaries or anchors to keep a pattern from matching inside a longer token.
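The effect of a word boundary can be seen with grep standing in for extract's own matcher. This is only a sketch: the regex dialect extract uses is not stated here, so the example uses [0-9] in place of \d and relies on \b, which GNU grep accepts as an extension. The function names and sample line are made up for illustration.

```shell
# Unanchored pattern: also fires inside the longer token XSERIAL123.
match_anywhere() {
    grep -oE 'SERIAL[0-9]+'
}

# Same pattern with \b on both sides: only standalone tokens match.
match_whole_token() {
    grep -oE '\bSERIAL[0-9]+\b'
}

sample='id=XSERIAL123 unit SERIAL456 ok'
printf '%s\n' "$sample" | match_anywhere      # SERIAL123 and SERIAL456
printf '%s\n' "$sample" | match_whole_token   # SERIAL456 only
```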
log_level controls verbosity. At info level, every match is reported. Use
debug for detailed processing steps showing how tokens are parsed and why
rules may not match.
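The example configuration can be put in place with a short script. This is a sketch, not an installer shipped with extract: write_example_config is a hypothetical helper, and it refuses to overwrite a file that already exists.

```shell
# Write the example configuration to ~/.config/extract/config.toml.
write_example_config() {
    dir="${HOME}/.config/extract"
    file="${dir}/config.toml"
    if [ -e "$file" ]; then
        echo "config already exists: $file" >&2
        return 1
    fi
    mkdir -p "$dir" || return 1
    # The quoted 'EOF' keeps the backslashes and $0 literal.
    cat > "$file" <<'EOF'
log_level = "debug"
[custom_regexes]
"SERIAL\\d+" = "$0"
EOF
}

# Assumed shape of the one-off alternative. The flag name comes from the
# text above; the argument form and the second pattern are guesses:
#   extract --regex 'SERIAL\d+' --regex 'ASSET\d+' syslog.txt
```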
Benchmarking
Run Criterion benchmarks to gauge performance over time:
cargo bench --bench performance
Benchmarks print throughput statistics that help detect regressions.