Module std::io::scanner

Line-by-line and word-by-word scanner (bufio.Scanner-style).

Scanner iterates over tokens from a string source, stdin, or a whole file. The default split mode is SplitLines: each call to scan() advances past one line and returns the updated scanner state. Use with_split to switch to SplitWords (whitespace-separated tokens).

Basic iteration

import std::io::scanner;

fn main() {
    var sc = scanner.from_string("one\ntwo\nthree");
    sc = scanner.scan(sc);
    while scanner.has_next(sc) {
        println(scanner.text(sc));
        sc = scanner.scan(sc);
    }
}

Batch helpers

lines() and words() return a Vec<string> directly:

import std::io::scanner;

fn main() {
    let ls = scanner.lines("foo\nbar\nbaz");
    for i in 0 .. ls.len() {
        println(ls.get(i));
    }
}

Using files and stdin

from_stdin() reads all standard input before scanning. from_file() returns Result<Scanner, fs.IoError> and reads the whole file before scanning:

import std::fs;
import std::io::scanner;

fn main() {
    match scanner.from_file("input.txt") {
        Ok(sc0) => {
            var sc = sc0;
            sc = scanner.scan(sc);
            while scanner.has_next(sc) {
                println(scanner.text(sc));
                sc = scanner.scan(sc);
            }
        },
        Err(e) => panic(fs.io_error_message(e)),
    }
}

Split modes

Mode        Description
SplitLines  Lines split on \n; \r\n stripped (default).
SplitWords  Fields split on any run of ASCII whitespace.
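
The difference between the two modes can be seen by running both batch helpers over the same input. A minimal sketch, using only lines() and words() as documented in this module; the sample string is illustrative:

```
import std::io::scanner;

fn main() {
    let src = "one two\nthree";
    // Line mode: "one two" and "three".
    let ls = scanner.lines(src);
    // Word mode: "one", "two", and "three".
    let ws = scanner.words(src);
    println(ls.len());
    println(ws.len());
}
```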

Contents

Functions

Function from_string

pub fn from_string(s: string) -> Scanner

Create a scanner over s using the default SplitLines mode.

Call scan() to advance to the first token.

Examples

import std::io::scanner;

fn main() {
    var sc = scanner.from_string("alpha\nbeta\ngamma");
    sc = scanner.scan(sc);
    while scanner.has_next(sc) {
        println(scanner.text(sc));
        sc = scanner.scan(sc);
    }
}

Function from_stdin

pub fn from_stdin() -> Scanner

Create a scanner over all data read from standard input.

This v0.5 implementation reads stdin to completion before scanning, rather than looping on io.read_line(), so EOF is not confused with a real empty line.
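
Examples

A minimal sketch of a stdin loop, assuming a scanner built by from_stdin() is driven with the same scan/has_next/text pattern shown for from_string():

```
import std::io::scanner;

fn main() {
    // All of standard input is read here, before the first scan().
    var sc = scanner.from_stdin();
    sc = scanner.scan(sc);
    while scanner.has_next(sc) {
        println(scanner.text(sc));
        sc = scanner.scan(sc);
    }
}
```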

Function from_file

pub fn from_file(path: string) -> Result<Scanner, fs.IoError>

Read path into memory and create a scanner over its contents.

Returns Err(fs.IoError) if the file cannot be opened or read.
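
Examples

A sketch that counts the words in a file by combining from_file(), with_split(), and collect(). It assumes that collect() on a scanner that has not yet been scanned starts from the first token, which the collect documentation implies but does not state outright. The path "input.txt" is a placeholder.

```
import std::fs;
import std::io::scanner;

fn main() {
    match scanner.from_file("input.txt") {
        Ok(sc0) => {
            // Switch to word mode, then gather every token at once.
            // ASSUMPTION: collect() on an unscanned scanner starts at the first token.
            let sc = scanner.with_split(sc0, scanner.SplitWords);
            let ws = scanner.collect(sc);
            println(ws.len());
        },
        Err(e) => panic(fs.io_error_message(e)),
    }
}
```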

Function with_split

pub fn with_split(sc: Scanner, mode: SplitMode) -> Scanner

Override the split mode on an existing scanner and reset the cursor to the start of the source (position 0).

Call scan() after with_split to advance to the first token.

Examples

import std::io::scanner;

fn main() {
    var sc = scanner.from_string("one two  three");
    sc = scanner.with_split(sc, scanner.SplitWords);
    sc = scanner.scan(sc);
    while scanner.has_next(sc) {
        println(scanner.text(sc));
        sc = scanner.scan(sc);
    }
}
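
The cursor reset can be observed by changing modes partway through a scan. A minimal sketch based on the behaviour described above; the sample string is illustrative:

```
import std::io::scanner;

fn main() {
    var sc = scanner.from_string("a b\nc d");
    sc = scanner.scan(sc);
    println(scanner.text(sc));  // first line: "a b"
    // Switching modes moves the cursor back to the start of the source.
    sc = scanner.with_split(sc, scanner.SplitWords);
    sc = scanner.scan(sc);
    println(scanner.text(sc));  // first word: "a"
}
```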

Function scan

pub fn scan(sc: Scanner) -> Scanner

Advance to the next token and return the updated scanner.

Check has_next() on the returned value; if true, text() gives the current token. Returns the scanner unchanged once exhausted.

Examples

import std::io::scanner;

fn main() {
    var sc = scanner.from_string("a\nb\nc");
    sc = scanner.scan(sc);
    while scanner.has_next(sc) {
        println(scanner.text(sc));
        sc = scanner.scan(sc);
    }
}

Function next_line

pub fn next_line(sc: Scanner) -> (Scanner, Option<string>)

Advance in line mode and return the updated scanner with the next line.

This helper ignores the current split mode and always uses SplitLines. A real empty line returns Some(""); EOF returns None.
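
Examples

A sketch of a next_line() loop that distinguishes an empty line from EOF. It assumes the language supports tuple destructuring in a let binding and bool literals, neither of which appears in the other examples in this module; adjust the binding syntax if your version differs.

```
import std::io::scanner;

fn main() {
    var sc = scanner.from_string("first\n\nlast");
    var more = true;
    while more {
        // ASSUMPTION: tuple destructuring in a let binding.
        let (sc2, line) = scanner.next_line(sc);
        sc = sc2;
        match line {
            // The empty middle line arrives as Some("").
            Some(l) => println(l),
            // None marks EOF and ends the loop.
            None => { more = false; },
        }
    }
}
```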

Function has_next

pub fn has_next(sc: Scanner) -> bool

Return true if the last scan() produced a token.

Returns false on a freshly constructed scanner (before the first scan()) and after exhaustion.

Function text

pub fn text(sc: Scanner) -> string

Return the current token text.

Valid when has_next() is true; returns an empty string otherwise.
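
Examples

A minimal sketch of what text() returns in each scanner state, following the rules documented for has_next() and text():

```
import std::io::scanner;

fn main() {
    var sc = scanner.from_string("only");
    // Fresh scanner: has_next() is false, so text() is the empty string.
    println(scanner.text(sc));
    sc = scanner.scan(sc);
    // One token scanned: text() is "only".
    println(scanner.text(sc));
    sc = scanner.scan(sc);
    // Exhausted: has_next() is false again and text() is empty.
    println(scanner.text(sc));
}
```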

Function lines

pub fn lines(s: string) -> Vec<string>

Return all lines from s as a Vec<string>.

Strips \r\n and \n line endings. A trailing newline does not produce a trailing empty element; the final newline only terminates the last line (Go bufio.Scanner semantics). An empty input returns an empty vector.

For trailing-empty-element behaviour, use std::string.lines().

Examples

import std::io::scanner;

fn main() {
    let ls = scanner.lines("foo\nbar\n");
    // ls == ["foo", "bar"]
    println(ls.len());
}

Function words

pub fn words(s: string) -> Vec<string>

Return all whitespace-delimited words from s as a Vec<string>.

Any run of ASCII whitespace (space, tab, CR, LF) acts as a separator. Leading and trailing whitespace is ignored.

Examples

import std::io::scanner;

fn main() {
    let ws = scanner.words("  hello   world  ");
    // ws == ["hello", "world"]
    println(ws.len());
}

Function collect

pub fn collect(sc: Scanner) -> Vec<string>

Collect all remaining tokens from sc into a Vec<string>.

If has_next(sc) is already true (i.e. scan() was called before collect), the current token is included as the first element. Advances the scanner to exhaustion.

Examples

import std::io::scanner;

fn main() {
    var sc = scanner.from_string("a\nb\nc\nd");
    sc = scanner.scan(sc);  // "a"
    sc = scanner.scan(sc);  // "b"
    let rest = scanner.collect(sc);
    // rest == ["b", "c", "d"] (current token + remaining)
}

Types

Enum SplitMode

How the scanner tokenises its source.

Variants

SplitLines

Lines split on \n; a \r immediately before the \n is stripped.

SplitWords

Tokens split on any run of ASCII whitespace (space, tab, \r, \n).

Struct Scanner

Scanner state value.

Hold a var sc: Scanner and drive it with scan(); see module docs for full usage.

Fields

source: string

pos: i64

Byte position of the next unconsumed character.

current: string

Text of the current token. Valid after scan() when has_next() is true; empty string before the first scan() or after exhaustion.

valid: bool

True when the last scan produced a token. Required so a real empty line ("\n") remains distinguishable from EOF.

done: bool

True when all tokens have been emitted.

mode: SplitMode

Token split mode; SplitLines by default, changed via with_split.