PYTHON
Extracting Hashtags from Text
Extract all hashtags (e.g., #webdev, #python) from a string using Python's `re` module, useful for parsing social media content or user input.
import re
def extract_hashtags(text):
    # Find all occurrences of # followed by one or more word characters (letters, numbers, underscores)
    # re.findall returns a list of all non-overlapping matches
    hashtags = re.findall(r'#(\w+)', text)
    return hashtags
message = "This is a great #post about #python and #regex. #webdev rocks!"
extracted = extract_hashtags(message)
print(extracted) # Output: ['post', 'python', 'regex', 'webdev']
message_no_hashtags = "No hashtags here."
print(extract_hashtags(message_no_hashtags)) # Output: []
How it works: The `extract_hashtags` function uses `re.findall` to locate every hashtag in a given text. The regular expression `r'#(\w+)'` matches a hash symbol (`#`) followed by one or more "word characters" (`\w+`: letters, digits, and underscores; for `str` patterns in Python 3 this includes Unicode letters and digits unless the `re.ASCII` flag is set). The parentheses around `\w+` create a capturing group, so `findall` returns only the text matched by that group, i.e., the hashtag content without the `#` symbol itself.
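One limitation of `r'#(\w+)'` is that it does not check what precedes the `#`, so a `#` in the middle of a word is also treated as a hashtag (`abc#123` yields `123`). The following is a minimal sketch of one possible refinement, assuming a hashtag should never directly follow a word character. The names `HASHTAG_PATTERN`, `extract_hashtags_strict`, and `extract_hashtags_with_symbol` are hypothetical and not part of the original snippet.

```python
import re

# Assumption: a '#' that directly follows a word character (as in "abc#123")
# is not a hashtag. The negative lookbehind (?<!\w) rejects those cases.
HASHTAG_PATTERN = re.compile(r'(?<!\w)#(\w+)')

def extract_hashtags_strict(text):
    """Return hashtag names (without '#') that do not start mid-word."""
    return HASHTAG_PATTERN.findall(text)

def extract_hashtags_with_symbol(text):
    """Return the same hashtags, keeping the leading '#'."""
    # group(0) is the whole match, including the '#' outside the capturing group
    return [match.group(0) for match in HASHTAG_PATTERN.finditer(text)]

sample = "Issue abc#123 is tagged #bug and #help_wanted."
print(extract_hashtags_strict(sample))       # Output: ['bug', 'help_wanted']
print(extract_hashtags_with_symbol(sample))  # Output: ['#bug', '#help_wanted']
```

Compiling the pattern once with `re.compile` is optional here; it mainly keeps the two helpers in sync. Whether mid-word `#` should be excluded depends on the input, so treat the lookbehind as one design choice among several.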