Python Word Counter: Count Words in Text with Whitespace Handling

💡 Key Takeaways: Word Counting Functions

This exercise has two functions: a core one that counts words, and an extended one that also reports the character count excluding spaces and an approximate sentence count based on punctuation. It's a clean demo of basic text prep: strip the edges, split on whitespace, and take len() of the result. We'll cover the basic function (strip and split), the extended function (replace and sum), and an example showing the outputs.

1. Basic Counter: Trim and Split Logic

The word_counter function takes a string, trims it, splits it into words, and returns the count:

def word_counter(text: str) -> int:
    """
    Count the number of words in the given text.
    """
    # Remove leading and trailing whitespaces
    clean_text = text.strip()

    # Split on runs of whitespace (multiple spaces between words count as one separator)
    words = clean_text.split()

    # Return the number of words
    return len(words)

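strip() removes leading and trailing whitespace. split() with no argument treats any run of whitespace as a single separator, so no empty strings end up in the list. len() then gives the word count. Simple, and it handles " a b " as 2 words.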
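To see why the no-argument form of split() matters, here is a small comparison with split(" "), which splits on every single space. The sample string is made up for illustration.

```python
text = "  a   b  "

# split() with no argument splits on runs of whitespace and drops empty strings
no_arg = text.split()

# split(" ") splits on each individual space, so empty strings appear in the result
single_space = text.split(" ")

print(no_arg)
print(single_space)
print(len(no_arg), len(single_space))
```

One side effect: because split() with no argument already ignores leading and trailing whitespace, the strip() call in word_counter is not strictly required for the word count. It still documents the intent, and the extended function relies on it for the character count.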

2. Extended Counter: Add Chars and Sentences

The word_counter_extended function adds two more metrics:

def word_counter_extended(text: str) -> tuple:
    """
    Count words, characters (without spaces), and sentences in the text.
    Returns a tuple: (words_count, chars_count, sentences_count)
    """
    clean_text = text.strip()
    words = clean_text.split()
    words_count = len(words)

    # Count characters excluding spaces (tabs and newlines inside the text still count)
    chars_count = len(clean_text.replace(" ", ""))

    # Approximate the sentence count by counting every ., !, and ? in the text
    sentences_count = sum(clean_text.count(p) for p in ".!?")
    return words_count, chars_count, sentences_count

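It reuses the same strip/split steps for the word count. replace(" ", "") removes space characters before the character count, so punctuation is still included. sum() adds up the occurrences of ., !, and ? as a rough sentence count: every mark is counted, so "..." counts as 3 and a decimal like 3.14 counts as 1. The function returns a tuple to hand back all three values at once.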
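The two approximations above can be tightened a little. The sketch below is one possible variant, not part of the original exercise: count_non_whitespace and count_sentences are hypothetical helper names. The first drops all whitespace (not only spaces), and the second treats a run of ending punctuation such as "..." or "?!" as a single sentence end.

```python
import re


def count_non_whitespace(text: str) -> int:
    """Count characters that are not whitespace (spaces, tabs, newlines)."""
    # split() drops every whitespace run; joining the pieces leaves the rest
    return len("".join(text.split()))


def count_sentences(text: str) -> int:
    """Count runs of ., ! or ? so that '...' or '?!' count once."""
    return len(re.findall(r"[.!?]+", text))


sample = "Wait...\twhat?! Fine."

# Original approach: every punctuation mark counts separately
naive_sentences = sum(sample.count(p) for p in ".!?")

print(count_non_whitespace(sample))
print(naive_sentences, count_sentences(sample))
```

This is still a heuristic: decimals and abbreviations such as "e.g." will still be counted as sentence ends.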

3. Example Usage: Test with Sample

Run the sample under a main guard:

if __name__ == "__main__":
    sample_text = "  Python   is    awesome!  "
    print(sample_text)
    print("Words:", word_counter(sample_text))
    print("Extended:", word_counter_extended(sample_text))

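For the sample, word_counter returns 3 and word_counter_extended returns (3, 16, 1). The character count is 16 because the exclamation mark is included: "Python" (6) + "is" (2) + "awesome!" (8). This shows how the functions handle extra spaces and punctuation.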
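Since the extended function returns a tuple, the caller can unpack it into named variables instead of indexing. The sketch below repeats the function in compact form so it runs on its own; the sample sentence is made up for illustration.

```python
def word_counter_extended(text: str) -> tuple[int, int, int]:
    """Return (words_count, chars_count, sentences_count) for the text."""
    clean_text = text.strip()
    words_count = len(clean_text.split())
    chars_count = len(clean_text.replace(" ", ""))
    sentences_count = sum(clean_text.count(p) for p in ".!?")
    return words_count, chars_count, sentences_count


# Unpack the three values in the order the function returns them
words, chars, sentences = word_counter_extended("Hello world. How are you?")
print(f"words={words}, chars={chars}, sentences={sentences}")
```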


🎯 Summary and Reflections

This word counter teaches text basics, from cleaning to counting. It reminded me:

  • Whitespace tricks: Strip/split manage messiness.
  • Multi-metrics: Extend for chars/sentences easily.
  • Tuple returns: Pack multiple values neatly.

Great for quick counts on logs or essays. As a next step, decide how hyphenated words and quoted text should be counted.

Advanced Alternatives: Use a regex to count words, len(re.findall(r'\w+', text)) after import re, or collections.Counter for word frequencies. Your text tip? Comment!
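Here is a minimal sketch of both alternatives on a made-up sentence. Note that \w+ matches runs of letters, digits, and underscores, so it splits contractions like "Don't" into two tokens, which whitespace splitting does not.

```python
import re
from collections import Counter

text = "The cat saw the other cat. Don't panic!"

# Lowercase first so "The" and "the" are counted as the same word
tokens = re.findall(r"\w+", text.lower())

# Whitespace splitting keeps punctuation attached and contractions whole
whitespace_words = text.split()

print(len(tokens), len(whitespace_words))

# Counter maps each token to the number of times it occurs
freq = Counter(tokens)
print(freq.most_common(2))
```

Which count is "right" depends on the use case: the regex version ignores punctuation but breaks on apostrophes, while split() keeps "cat." and "cat" as different strings.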
