Extract

Welcome to the documentation for extract, a command-line utility that parses text for network identifiers such as IP addresses, CIDR ranges and MAC addresses.

Features

  • IPv4 and IPv6 extraction
  • CIDR block recognition
  • MAC address matching
  • IP range parsing
  • Streaming operation for large files

Installation

See the project README for platform-specific binaries, or build from source:

cargo build --release

Getting Started

Run extract with some input text to see immediate results:

echo 'Access from 192.168.1.1 to 10.0.0.0/8' | extract

Output:

192.168.1.1
10.0.0.0/8

The following chapters provide more examples and advanced usage tips.

Examples

The following snippets demonstrate common usage patterns.

# Basic extraction from a string
echo "Server at 192.168.1.1 connected" | extract
# Process a log file and remove duplicates
extract /var/log/firewall.log | sort | uniq
# Interactive mode with your editor
extract
# type or paste text, then save and exit
# Combine with grep to filter for a specific network
extract syslog.txt | grep "192.168.1."

Advanced Tips

  • Use custom regular expressions in config.toml to match proprietary patterns.
  • Combine extract with other Unix tools for powerful pipelines.
  • For performance benchmarking, run cargo bench --bench performance.
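
As a rough sketch of the pipeline tip, the helper below ranks identifiers by how often they occur. It assumes extract prints one identifier per line, as in the Getting Started output. The function name top_identifiers is hypothetical and not part of extract.

```shell
#!/bin/sh
# Hypothetical helper, not part of extract: rank identifiers by frequency.
# Reads one identifier per line on stdin.
top_identifiers() {
    # sort groups duplicates, uniq -c prefixes each with its count,
    # sort -rn orders by count (highest first), head keeps the top N.
    # N defaults to 10 when no argument is given.
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Usage, assuming extract is on PATH:
#   extract /var/log/firewall.log | top_identifiers 5
```

Each output line holds a count followed by the identifier, so the result can be fed into further tools such as awk.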

Custom Configuration

Create ~/.config/extract/config.toml to tweak behaviour. There you can supply your own regex patterns and adjust the logging level. For quick one-off patterns, use the --regex flag on the command line instead, repeating it once per pattern. A sample config.toml:

log_level = "debug"
[custom_regexes]
"SERIAL\\d+" = "$0"

These patterns match anywhere on a line. Add word boundaries or anchors to prevent capturing text across multiple tokens.
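
The effect of word boundaries can be seen in the sketch below. It uses GNU grep as a stand-in for extract's matcher, so it only illustrates the regex behaviour and not extract itself. The sample line and function names are made up for the illustration. Assuming extract's regex engine accepts \b, the corresponding config.toml key would presumably be "\\bSERIAL\\d+\\b".

```shell
#!/bin/sh
# Illustration only: GNU grep stands in for extract's matcher.
# The sample line is invented for this example.
line='unit XSERIAL123 replaced by SERIAL456'

# Unanchored pattern: also matches inside the longer token XSERIAL123.
unanchored() {
    printf '%s\n' "$line" | grep -oE 'SERIAL[0-9]+'
}

# \b on both sides: only the standalone token SERIAL456 matches.
bounded() {
    printf '%s\n' "$line" | grep -oE '\bSERIAL[0-9]+\b'
}
```

Calling unanchored prints both SERIAL123 and SERIAL456, while bounded prints only SERIAL456.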

log_level controls verbosity. At info level, every match is reported. Use debug for detailed processing steps showing how tokens are parsed and why rules may not match.

Benchmarking

Run Criterion benchmarks to gauge performance over time:

cargo bench --bench performance

Benchmarks print throughput statistics that help detect regressions.