JAVASCRIPT

Extract All Words from a String

Extract an array of words from a text string with a JavaScript regular expression, skipping numbers, punctuation, and extra whitespace.

function extractWords(text) {
  // \p{L} matches any Unicode letter (the u flag is required for \p{...}).
  // The optional group keeps an internal apostrophe, so "I'm" stays one word.
  const wordRegex = /\p{L}+(?:['’]\p{L}+)*/gu;
  const matches = text.match(wordRegex);
  return matches || []; // match() returns null when nothing matches
}

// Examples
const sentence = "Hello world! How are you in 2023? I'm fine.";
console.log(extractWords(sentence)); // [ 'Hello', 'world', 'How', 'are', 'you', 'in', "I'm", 'fine' ]

const unicodeText = "Français español übermensch cafés";
console.log(extractWords(unicodeText)); // [ 'Français', 'español', 'übermensch', 'cafés' ]

const noWords = "123 !@# $ %";
console.log(extractWords(noWords)); // []

How it works: The `extractWords` function uses the regular expression `/\p{L}+(?:['’]\p{L}+)*/gu` to collect every word in a string. `\p{L}+` matches one or more Unicode letters, so accented and non-Latin words are handled the same way as ASCII ones. The optional group `(?:['’]\p{L}+)*` lets an apostrophe join two letter runs, which keeps contractions such as "I'm" intact. The `u` flag enables `\p{...}` property escapes, and the `g` flag makes `match()` return every match as an array. Digits, punctuation, and whitespace are never part of a match and act only as separators, so a letter run attached to digits, as in `abc123`, is still extracted as `abc`. Because `match()` returns `null` when nothing matches, the function falls back to an empty array.
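
If letter runs glued to digits or underscores (for example the `abc` in `abc123`) should be skipped rather than extracted, one option is to check the edges of each word with lookarounds. `\b` is not a dependable tool for this with accented text: it is defined in terms of `\w`, which covers only ASCII letters, digits, and the underscore, so a leading `ü` is not seen as the start of a word. The sketch below is one possible approach, not part of the original snippet; the name `extractWholeWords` is hypothetical, and it assumes a runtime with lookbehind and Unicode property escape support (ES2018 or later).

```javascript
// Hypothetical stricter variant: a letter run counts as a word only when it
// is not directly preceded or followed by a letter, digit, or underscore.
function extractWholeWords(text) {
  // (?<!...) and (?!...) assert what must NOT sit next to the letter run.
  const wholeWordRegex = /(?<![\p{L}\p{N}_])\p{L}+(?![\p{L}\p{N}_])/gu;
  return text.match(wholeWordRegex) || []; // match() returns null on no match
}

console.log(extractWholeWords("abc123 übermensch x_y 42nd café"));
// [ 'übermensch', 'café' ]
```

Here `abc123`, `x_y`, and `42nd` are dropped entirely because their letters touch a digit or underscore. This variant does not keep apostrophes, so a contraction such as "I'm" would come back as two separate words.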
