Conclusion
The comparative analysis of Gemini and OpenAI outputs across the Contemporary Islam project reveals a corpus that is thematically unified but structurally differentiated by provider. Both groups engage the same source material and converge on core thematic pillars -- Islamic finance and the circular economy, human rights under Islamic law, modest fashion, feminism, political governance, environmental stewardship, and the halal economy. However, meaningful differences emerge in vocabulary distribution, topical granularity, classification emphasis, entity recognition depth, content redundancy patterns, and rhetorical framing.
Vocabulary and Term Weighting
Gemini produces a notably higher raw frequency for the anchor term "islamic" (627 vs. 402), suggesting longer or more repetitive elaboration around core Islamic concepts. OpenAI, by contrast, surfaces terms absent from Gemini's top-20 -- "legal" (134), "community" (100), "public" (78), "social" (73), and "avoid" (60) -- pointing to a more practical, action-oriented, and community-centered lexicon. TF-IDF analysis reinforces this: Gemini weights "islamic" (3.42) far above OpenAI (1.81), while OpenAI elevates "circular" (3.78 vs. 2.65), "modernity" (2.50 vs. 1.37), "human rights" (2.21 vs. 1.30), and "community" (1.37, absent in Gemini's top-20). OpenAI thus distributes semantic emphasis more evenly across sub-themes, whereas Gemini concentrates weight on the overarching "Islamic" identifier.
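The TF-IDF contrast described above can be sketched in a few lines of stdlib Python. The mini-corpus below is purely illustrative (not the actual project outputs), and the weighting uses the common tf × log(N/df) variant, which may differ from the exact formula the analysis pipeline used.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute TF-IDF weights per document using tf * log(N / df)."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency: one count per doc
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

# Illustrative mini-corpus (placeholder text, not the analyzed corpus)
docs = [
    "islamic finance and islamic law shape the halal economy",
    "community rights and the circular economy under modern law",
]
w = tfidf(docs)
# Terms concentrated in one document rank highest; terms shared by all docs score zero
print(sorted(w[0], key=w[0].get, reverse=True)[:3])
```

Note how a term repeated in only one document ("islamic" here) dominates that document's weights, which is the pattern driving Gemini's 3.42 score on the anchor term.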
Topic Structure and Coverage
Gemini's LDA model yields six primary topics, with a single macro-cluster absorbing 95.8% of prevalence (led by Islam/Modernity/Social Reform at 37.4%) and leaving democracy and political pluralism as an outlier at only 4.2%. OpenAI resolves into 13 topics with a more balanced distribution: Islamic Governance and Political Modernity (26.7%), Halal Finance and Circular Economy (14.5%), Human Rights and Islamic Law (12.6%), and several mid-range topics between 5% and 10%. This finer-grained decomposition makes OpenAI's output more navigable for researchers seeking discrete thematic entry points. Both groups flag overlapping topics requiring consolidation, particularly around religious identity, legal frameworks, and community/digital practice.
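One way to quantify "more balanced distribution" is the normalized Shannon entropy of each model's topic-prevalence vector. The vectors below are hypothetical stand-ins shaped like the two models described above (only the headline prevalences come from the analysis; the remainders are assumed fillers), so treat the printed values as illustrative.

```python
import math

def normalized_entropy(prevalences):
    """Shannon entropy of a topic-prevalence vector, scaled to [0, 1].
    1.0 means perfectly even coverage; values near 0 mean one topic dominates."""
    total = sum(prevalences)
    probs = [p / total for p in prevalences if p > 0]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(probs)) if len(probs) > 1 else 0.0

# Hypothetical prevalence vectors modeled on the two distributions above:
skewed   = [0.374, 0.30, 0.15, 0.092, 0.042, 0.042]           # 6 topics, macro-cluster dominates
balanced = [0.267, 0.145, 0.126] + [0.07] * 5 + [0.0224] * 5  # 13 topics, flatter tail
print(normalized_entropy(skewed), normalized_entropy(balanced))
```

A flatter 13-topic distribution scores higher entropy than a skewed 6-topic one, which is the property that makes OpenAI's structure more useful as a navigation scaffold.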
Sentiment and Classification
Sentiment distributions are closely aligned -- Gemini at 86.9% positive and OpenAI at 88.8% -- confirming that both providers maintain a constructive, advocacy-adjacent tone. Classification diverges more sharply: Gemini assigns 29.3% of content to Economy & Business and 20.2% to Law & Security, while OpenAI inverts the weighting (17.5% Economy & Business, 32.5% Law & Security) and activates additional categories including Education (3.8%), Health (1.3%), and Quran & Revelation (1.3%). OpenAI's broader classification schema captures more disciplinary diversity from the same corpus.
Entity Recognition and Structural Signals
Gemini identifies significantly more named entities overall (4,316 vs. 2,173), with richer geographic (Indonesia 31, Turkey 15, Europe 14) and person-level tagging (Muhammad 26, Maqasid al-Sharia 14). OpenAI's NER is comparatively sparse -- only one person entity (Islam, 11) and one geographic entity reached the top lists -- and over-generates MONEY-type entities through heading markers (###), suggesting less robust preprocessing. Co-occurrence networks are identical between groups (density 0.59, clustering 0.71), confirming shared structural topology at the network level.
Redundancy, N-Grams, and Framing
OpenAI exhibits three near-duplicate chunk pairs (including one perfect 1.0 similarity), driven largely by identical "No external sources used" boilerplate, versus Gemini's single near-duplicate. OpenAI's n-gram profile is also substantially richer (16 significant collocations vs. 6), surfacing domain-specific phrases such as "human rights lens," "goes wrong," "street style," and "actionable checks" that indicate more varied phraseological output. On framing, Gemini uses far more passive voice (199 constructions vs. 79) and more intensifiers (31 vs. 5), while OpenAI relies slightly more on hedging (17 vs. 15). Both average a college-level complexity grade (~15), but Gemini's higher passive and intensifier counts suggest a more formal, sometimes less direct rhetorical style. High-bias resource profiles also differ: Gemini flags theology-and-extremism pieces, while OpenAI flags radicalization and intersectional-rights content.
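Significant collocations of the kind listed above are typically scored by pointwise mutual information (PMI) over bigrams. A minimal stdlib sketch, using placeholder text rather than the project corpus (the real pipeline may use a different association measure):

```python
import math
from collections import Counter

def pmi_bigrams(tokens, min_count=2):
    """Score bigrams by PMI: log( p(x,y) / (p(x) * p(y)) )."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (x, y), c in bigrams.items():
        if c < min_count:  # skip rare pairs to avoid unstable scores
            continue
        scores[(x, y)] = math.log((c / (n - 1)) / ((unigrams[x] / n) * (unigrams[y] / n)))
    return scores

text = ("human rights lens frames the debate and the human rights lens "
        "recurs while unrelated words drift between the rights talk")
tokens = text.lower().split()
scores = pmi_bigrams(tokens)
top = max(scores, key=scores.get)
print(top, round(scores[top], 2))
```

Bigrams whose words co-occur more often than chance predicts (like "human rights" here) get high positive PMI, which is how phrases such as "human rights lens" surface as significant collocations.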
Recommendations
High Priority
Deduplicate boilerplate content across OpenAI outputs. Three near-duplicate pairs -- including an exact 1.0 match -- stem from "No external sources used" reference blocks. These inflate similarity metrics and distort redundancy analysis. Implement post-generation stripping or chunking rules that exclude formulaic reference sections before analysis.
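The strip-then-compare flow can be sketched as follows; the boilerplate pattern and the two chunk texts are illustrative, not taken from the actual outputs, and a production pipeline would likely match a fuller reference-block pattern than this single phrase.

```python
import math
import re
from collections import Counter

# Hypothetical boilerplate pattern; extend to cover full reference blocks as needed
BOILERPLATE = re.compile(r"No external sources used\.?", re.IGNORECASE)

def cosine(a, b):
    """Cosine similarity between two bag-of-words count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

chunk_a = "Zakat funds circular recycling programs. No external sources used."
chunk_b = "Modest fashion shapes retail branding. No external sources used."

before = cosine(chunk_a, chunk_b)
after = cosine(BOILERPLATE.sub("", chunk_a), BOILERPLATE.sub("", chunk_b))
print(round(before, 2), round(after, 2))  # similarity drops once boilerplate is removed
```

Two thematically unrelated chunks score well above zero purely because they share the reference block; stripping it before analysis removes that inflation.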
Standardize NER preprocessing for OpenAI. The heavy MONEY-type entity counts driven by markdown heading symbols (###, #) indicate that OpenAI's raw output is not being adequately cleaned before entity extraction. Apply regex-based header stripping to ensure NER results reflect genuine named entities rather than formatting artifacts.
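A hedged sketch of the header-stripping step; the exact markers in the real pipeline may differ, but the idea is to drop leading hash runs line by line before the text reaches the entity extractor.

```python
import re

def strip_markdown_headers(text):
    """Remove markdown heading markers (#, ##, ###) so downstream NER
    does not misread the symbols as currency or other symbol-led entities."""
    # Drop a run of 1-6 hashes at the start of each line, keeping the heading text
    return re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)

raw = ("### Halal Finance\n"
       "Sukuk issuance grew in Indonesia.\n"
       "## Governance\n"
       "Turkey revised its charter.")
print(strip_markdown_headers(raw))
```

The heading text itself survives (it often carries real topical signal), while the symbols that trigger spurious MONEY-type tags are gone.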
Leverage OpenAI's finer topic resolution for thematic navigation. OpenAI's 13-topic model provides more actionable segmentation than Gemini's 6-topic model, where a single macro-cluster dominates. For research portals or content indexes targeting Contemporary Islam, adopt OpenAI's topic structure as the primary navigation scaffold, supplemented by Gemini's broader contextual framing.
Medium Priority
Reduce passive voice density in Gemini outputs. At 199 passive constructions compared to OpenAI's 79, Gemini content may feel less direct and harder to parse for general audiences. If these outputs serve educational or public-facing purposes, apply editorial guidelines or post-processing prompts that favor active constructions.
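Passive constructions of the kind counted above can be flagged with a rough regex heuristic (a form of "be" followed by a past participle). Reliable passive detection needs a syntactic parser, so treat this as a first-pass screening filter for editorial review, not the counting method the analysis necessarily used.

```python
import re

# Rough heuristic: a form of "be" followed by a word ending in -ed or -en.
# It over- and under-counts (e.g. misses irregular participles like "built"),
# so it is a screening pass, not a full syntactic analysis.
PASSIVE = re.compile(r"\b(?:is|are|was|were|been|being|be)\s+\w+(?:ed|en)\b",
                     re.IGNORECASE)

def count_passives(text):
    return len(PASSIVE.findall(text))

active = "Scholars debate the reforms and communities adopt new practices."
passive = "The reforms were debated and new practices have been adopted by communities."
print(count_passives(active), count_passives(passive))
```

Flagged sentences can then be routed to post-processing prompts or editorial guidelines that favor active constructions.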
Enrich OpenAI's geographic and biographical entity coverage. Gemini identifies 31 Indonesia references, 15 Turkey, and 26 Muhammad mentions; OpenAI surfaces almost none of these. For projects requiring geopolitical or historical-figure mapping, either supplement OpenAI outputs with Gemini-derived entity layers or adjust OpenAI prompting to elicit more specific proper nouns.
Consolidate overlapping topics flagged by both providers. Both LDA models identify redundancy between religious identity, legal frameworks, and digital community topics. Merging these into consolidated super-topics would reduce noise and improve coherence scores, particularly for Gemini's zero-prevalence outlier topics (Topics 7-10).
Low Priority
Expand OpenAI's classification taxonomy for reuse. OpenAI activates 13 classification categories versus Gemini's 10, including Education, Health, and Quran & Revelation. Consider adopting this broader schema as a project-wide standard to capture disciplinary nuances that Gemini's coarser classification misses.
Monitor loaded-term density in sensitive sub-corpora. Both providers concentrate high-bias scores in radicalization, extremism, and rights-intersection content. For downstream publication or training use, flag these resources for manual review to ensure balanced framing, particularly Gemini's "Distinguishing mainstream theology from extremist ideology" (bias score 5.55) and OpenAI's "How online networks accelerate Islamic radicalization" (bias score 5.87).