
1. What is the primary purpose?

Performs a comprehensive analysis of the provided text based on a set of predefined questions.

def analyze_text(text):
    """
    Performs a comprehensive analysis of the provided text based on a set of
    predefined questions. This function is designed to extract key information,
    understand the context, and identify the core elements of the input text.
    """

    # Defaults. The "benefits_found", "constraints_found", and "tech_used"
    # keys are added further down only when a matching sentence is found.
    analysis_results = {
        "purpose": "Not identified",
        "mechanism": "Not identified",
        "components": [],
        "key_features": [],
        "target_audience": "Not identified",
        "benefits": [],
        "system_role": "Not identified",
        "constraints": [],
        "technologies_algorithms": [],
        "real_world_app": "Not identified"
    }

    # Collapse whitespace so multi-word keywords match across line breaks.
    clean_text = " ".join(text.split())
    lowered = clean_text.lower()
    # Split the normalized text (not the raw input) so that any keyword found
    # in clean_text can also be found inside one of the sentences.
    sentences = clean_text.split('.')

    # Purpose: first sentence containing the first keyword that occurs.
    purpose_keywords = ["aim", "goal", "purpose", "objective", "intended to",
                        "designed to", "serves to", "meant to", "function of"]
    for kw in purpose_keywords:
        if kw in lowered:
            for sentence in sentences:
                if kw in sentence.lower():
                    analysis_results["purpose"] = sentence.strip()
                    break
            break

    # Mechanism: how the subject works.
    mechanism_keywords = ["by", "through", "using", "via", "by means of",
                          "utilizing", "employs", "leverages", "enables",
                          "facilitates"]
    for kw in mechanism_keywords:
        if kw in lowered:
            for sentence in sentences:
                if kw in sentence.lower():
                    analysis_results["mechanism"] = sentence.strip()
                    break
            break

    # Components: parts or layers the subject is made of.
    component_keywords = ["consists of", "includes", "comprises", "contains",
                          "layers", "elements", "components", "parts"]
    for kw in component_keywords:
        if kw in lowered:
            for sentence in sentences:
                if kw in sentence.lower():
                    analysis_results["components"].append(sentence.strip())
                    break
            break

    # Key features: every keyword that occurs contributes its first sentence.
    feature_keywords = ["features", "characteristics", "properties", "key",
                        "noteworthy", "important", "essential", "crucial"]
    for kw in feature_keywords:
        if kw in lowered:
            for sentence in sentences:
                if kw in sentence.lower():
                    # Several keywords may hit the same sentence; store it once.
                    if sentence.strip() not in analysis_results["key_features"]:
                        analysis_results["key_features"].append(sentence.strip())
                    break

    # Target audience.
    audience_keywords = ["for", "target", "audience", "users", "intended for",
                         "designed for", "aimed at"]
    for kw in audience_keywords:
        if kw in lowered:
            for s in sentences:
                if kw in s.lower():
                    analysis_results["target_audience"] = s.strip()
                    break
            break

    # Benefits: stored under a separate key from the "benefits" placeholder.
    benefit_keywords = ["benefits", "advantages", "improves", "enhances",
                        "facilitates", "enables", "reduces", "increases",
                        "optimizes"]
    for kw in benefit_keywords:
        if kw in lowered:
            for s in sentences:
                if kw in s.lower():
                    analysis_results["benefits_found"] = s.strip()
                    break
            break

    # Constraints and challenges.
    constraint_keywords = ["limitations", "constraints", "challenges",
                           "difficulties", "problems", "issues", "drawbacks",
                           "obstacles"]
    for kw in constraint_keywords:
        if kw in lowered:
            for s in sentences:
                if kw in s.lower():
                    analysis_results["constraints_found"] = s.strip()
                    break
            break

    # Technologies and tools.
    tech_keywords = ["using", "implemented with", "built with", "powered by",
                     "leveraging", "utilizing"]
    for kw in tech_keywords:
        if kw in lowered:
            for s in sentences:
                if kw in s.lower():
                    analysis_results["tech_used"] = s.strip()
                    break
            break

    return analysis_results

def print_analysis(analysis):
    print("-" * 30)
    print("TEXT ANALYSIS REPORT")
    print("-" * 30)

    # Report label for each result key, in display order.
    labels = {
        "purpose": "Primary Purpose",
        "mechanism": "Mechanism/How it works",
        "components": "Key Components/Layers",
        "key_features": "Key Features",
        "target_audience": "Target Audience",
        "benefits_found": "Identified Benefits",
        "constraints_found": "Identified Constraints/Challenges",
        "tech_used": "Technologies/Tools Used"
    }

    for key, label in labels.items():
        value = analysis.get(key)
        if isinstance(value, list) and value:
            print(f"{label}:\n - " + "\n - ".join(value))
        elif value and not isinstance(value, list):
            print(f"{label}: {value}")
        else:
            # Missing key or empty list: nothing was extracted for this category.
            print(f"{label}: Not explicitly stated")

    print("-" * 30)

if __name__ == "__main__":

    sample_text = (
        "The Convolutional Neural Network (CNN) is a type of deep learning "
        "model specifically designed for processing structured arrays of "
        "data, such as images. CNNs leverage the power of convolutional "
        "layers to automatically and adaptively learn spatial hierarchies of "
        "features, from low-level edges to high-level objects. This "
        "architecture significantly improves image recognition accuracy "
        "compared to traditional methods. However, CNNs face challenges such "
        "as high computational requirements and the need for large datasets "
        "for training. This implementation is built with Python and utilizes "
        "the TensorFlow library to optimize training performance on GPUs."
    )

    results = analyze_text(sample_text)

    print_analysis(results)


        
### Explanation of the Implementation

The Python script above is a rule-based Natural Language Processing (NLP) tool that extracts structured information from unstructured text. Here is a breakdown of how it works:

#### 1. **Approach: Rule-Based Extraction**
Since the goal is to extract specific pieces of information (Purpose, Mechanism, Components, etc.) without the overhead of a large language model (like GPT-4), I used a **rule-based/heuristic approach**.

The script relies on:
* **Keyword Anchoring:** It searches for "anchor words" (e.g., "benefits", "limitations", "built with") that usually signal the presence of specific information.
* **Sentence Tokenization:** It splits the text on periods to isolate the sentence around each anchor word. This is a naive split, so abbreviations and decimal numbers also end a "sentence".
* **Pattern Matching:** It uses plain substring checks to find the first sentence that contains an anchor.
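
The three steps above can be sketched as one small helper. This is a minimal illustration rather than the script's exact code: the name `first_sentence_with` and the sample string are assumptions made for the example.

```python
def first_sentence_with(text, keywords):
    """Return the first sentence containing the first keyword found, else None."""
    # Normalize whitespace so multi-word anchors match across line breaks.
    clean_text = " ".join(text.split())
    # Naive sentence split on periods, mirroring the script's approach.
    sentences = [s.strip() for s in clean_text.split(".") if s.strip()]

    for kw in keywords:                 # keyword anchoring, in list order
        for sentence in sentences:      # pattern matching via substring check
            if kw in sentence.lower():
                return sentence
    return None


sample = "CNNs process images. This design improves accuracy. Training is slow."
print(first_sentence_with(sample, ["benefits", "improves"]))
# prints: This design improves accuracy
```

Keyword order matters here: the first keyword in the list that occurs anywhere decides which sentence is returned, even if a later keyword would point at a better one.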
        
#### 2. **Key Functional Components**
* **`analyze_text(text)`**: This is the core logic.
    * It iterates through several categories of information.
    * For each category, it looks for predefined trigger words.
    * When a trigger is found, it takes the first sentence containing that word and stores it under that category's key in the results dictionary.
* **`print_analysis(analysis)`**: This is the presentation layer.
    * It takes the raw dictionary of results and formats it into a clean, human-readable report.
    * It handles cases where information might be missing ("Not explicitly stated").
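
To show how the two roles divide the work, here is a table-driven sketch in which one dictionary of categories drives both extraction and reporting. The names `CATEGORIES`, `extract`, and `format_report` are hypothetical, and the keyword lists are shortened for the example; the original script spells out each category as its own loop.

```python
CATEGORIES = {
    # result key: (report label, trigger words); shortened, illustrative lists
    "benefits_found": ("Identified Benefits", ["benefits", "improves"]),
    "constraints_found": ("Identified Constraints/Challenges",
                          ["limitations", "challenges"]),
    "tech_used": ("Technologies/Tools Used", ["built with", "powered by"]),
}


def extract(text):
    """Core logic: map each category to the first sentence hit by a trigger word."""
    clean_text = " ".join(text.split())
    sentences = [s.strip() for s in clean_text.split(".") if s.strip()]
    results = {}
    for key, (_label, keywords) in CATEGORIES.items():
        for kw in keywords:
            match = next((s for s in sentences if kw in s.lower()), None)
            if match is not None:
                results[key] = match
                break  # first matching keyword wins for this category
    return results


def format_report(results):
    """Presentation layer: one line per category, with a fallback for gaps."""
    lines = []
    for key, (label, _keywords) in CATEGORIES.items():
        lines.append(f"{label}: {results.get(key, 'Not explicitly stated')}")
    return "\n".join(lines)


text = "The model improves accuracy. It is built with Python."
print(format_report(extract(text)))
```

Keeping the category table in one place means a new category needs only one new dictionary entry, at the cost of giving every category the same matching rule.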
        
#### 3. **Complexity & Performance**
* **Time Complexity:** $O(N \times K)$, where $N$ is the number of characters in the text and $K$ is the number of keywords being searched, treating keyword length as a small constant. Each keyword triggers at most a few linear passes over the text, so runtime grows linearly with document length.
* **Space Complexity:** $O(N)$ for the normalized copy of the text and the list of sentences, plus $O(M)$ for the results dictionary, where $M$ is the amount of extracted information.
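
A rough way to check the linear-in-$N$ behavior empirically is to time the keyword scan on inputs of increasing size. The sketch below is only a micro-benchmark under stated assumptions: `scan`, `time_scan`, the keyword list, and the filler sentence are made up for illustration, and absolute timings will vary by machine.

```python
import time

KEYWORDS = ["benefits", "limitations", "built with", "designed for"]


def scan(text, keywords):
    """Count how many keywords occur in text: K substring searches over N chars."""
    lowered = text.lower()  # lowercase once instead of once per keyword
    return sum(1 for kw in keywords if kw in lowered)


def time_scan(n_repeats):
    """Time one scan over a document built from n_repeats filler sentences."""
    # The filler contains none of the keywords, so every search covers the
    # whole document without an early match.
    doc = "This sentence is filler without any anchor words. " * n_repeats
    start = time.perf_counter()
    hits = scan(doc, KEYWORDS)
    return hits, time.perf_counter() - start


for n in (1_000, 10_000, 100_000):
    hits, seconds = time_scan(n)
    print(f"{n:>7} sentences: {hits} hits, {seconds:.4f} s")
```

If the analysis holds, each tenfold increase in document size should raise the measured time by roughly a factor of ten, although noise dominates at the smallest sizes.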
        
#### 4. **Limitations & Future Improvements**
While this script is fast and lightweight, a rule-based approach has limitations:
* **Dependency on Keywords:** If the text says "The way it works is..." and none of the mechanism keywords ("by", "through", "using", and so on) appear in it, the script will miss it unless "way" is added to the keyword list.
* **Lack of Semantic Understanding:** It doesn't "understand" meaning; it only recognizes patterns.
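
Keyword dependence also cuts the other way: plain substring checks can fire on unrelated words. The sketch below contrasts the substring test with a word-boundary regular expression. `contains_word` is a hypothetical helper, not part of the script.

```python
import re


def contains_word(keyword, sentence):
    """True only if keyword appears as a whole word or phrase in sentence."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None


sentence = "TensorFlow optimizes training performance on GPUs"

print("for" in sentence.lower())        # True: matches inside "performance"
print(contains_word("for", sentence))   # False: no standalone "for"
print(contains_word("training performance", sentence))  # True: phrase match
```

Short anchors such as "for", "by", and "key" are the most exposed to this kind of false positive, so they benefit most from whole-word matching.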
        
**To make this production-ready (Advanced Version):**
1.  **Use spaCy or NLTK:** Instead of simple string searching, use a proper NLP library to perform **Dependency Parsing**. This would allow the script to recognize that in "The technique improves accuracy", "accuracy" is the object being improved.
2.  **Named Entity Recognition (NER):** Use NER to identify technologies (e.g., TensorFlow, Python) without a predefined list. General-purpose NER models may not have a dedicated label for technologies, so this may require a custom-trained entity type.
3.  **Transformer Integration:** For highly complex documents, one might use a small transformer model (like BERT) fine-tuned for question answering (for example, on SQuAD) to extract answers to each of the predefined questions directly.

        
