For the analyst, not the influencer

Data work is notebook work.

Hypotheses. Cohorts. A/B tests. Dashboards. Things you can defend.

By PromptLeadz · Reading time 18 minutes · 50 prompts across 5 categories · Calibrated for 2026 frontier models

The pack in seven sentences

50 free data analyst and data scientist prompts across 5 categories of 10 each: hypothesis and question framing, data wrangling and SQL discipline, analysis and experimentation, visualization and dashboards, stakeholder communication and influence.
Calibrated for the analyst who writes a findings memo with the answer in the first sentence. Not for the analyst who writes a LinkedIn thread about being data-driven.
Twelve data-influencer phrases banned at the prompt level: "data-driven" (used vaguely), "actionable insights", "deep dive", "low-hanging fruit", "moves the needle", "single source of truth" (cargo cult), "ai/ml-powered", "10x analyst", "data is the new oil", "story with data" (vague version), "democratize data", "self-service analytics" (as buzzword).
Each prompt produces an artifact: a pre-registered analysis plan, a SQL review memo, an A/B test design, a chart with the right encoding, a findings memo, a model card. Notebook-grade artifacts, not vibes.
Component-built on the 8-Component Skeleton (identity, context, task, constraints, examples, output format, refusal conditions, evaluation). Magic words and persona-prompts are explicitly excluded.
Pairs with the PM Pack for the cross-functional product work, the EM Pack for the engineering-side data infra, and the Operator Pack for the finance and ops dashboards.
Free, no email gate. The pack is the proof that components beat magic words. The Vault and All Ten Drop-ins Bundle are the production-grade versions with evaluation harnesses around the prompts.

What separates the analyst from the data-influencer

Analytics is one of the most LinkedIn-saturated disciplines in tech. Threads about being data-driven, the magic of actionable insights, the 10x analyst archetype, democratizing data across the org, and the single source of truth get tens of thousands of likes. The threads describe a vibe. The actual job is notebook work.

An analyst's or data scientist's primary job is producing decision-grade artifacts: pre-registered analysis plans that survive the team's scientific review, SQL queries that pass code review on correctness and performance, A/B test designs with explicit kill criteria, charts encoded for the question being asked, findings memos that put the answer in the first sentence, model cards that document assumptions and known failure modes. None of these artifacts look exciting on a screenshot. All of them compound across quarters.

Six dimensions separate the analyst voice from the data-influencer voice. Substance: the analyst names the specific cohort, the time window, the metric definition; the influencer names the disposition (data-driven, insights-led, evidence-based). Statistical honesty: the analyst names the confidence interval, the assumptions, the limitations; the influencer names a single number and a headline. Methodology: the analyst writes the analysis plan before pulling the data; the influencer pulls the data and writes the conclusion that fits. Pushback: the analyst says the data cannot answer the question as asked; the influencer says yes and produces a number anyway. Tone: the analyst writes flat memos that hold up under scientific review; the influencer writes narrative arcs that go viral. Audience: the analyst writes for the product team, the exec sponsor, and the future analyst who reads the notebook; the influencer writes for the algorithm.

Both voices exist in the wild. Only one survives the post-launch analysis review, the methodology challenge from the senior analyst, and the year-end retrospective on which analyses actually changed decisions. This pack is calibrated for the analyst voice; it explicitly rejects the influencer voice at the prompt level by banning the genre's signature phrases inline. Output reads like a memo from an analyst who has just defended a finding in front of the PM and the CFO, not a thread from a personal-brand analyst who has not.

Five categories. The analyst workflow end to end.

The five categories map to the five operating disciplines that determine whether an analytics team produces decisions or accumulates dashboards no one reads. Hypothesis and Question Framing comes first because deciding what to measure is upstream of every analysis. Data Wrangling and SQL Discipline comes second because the SQL and dbt foundation is what determines whether the analysis is correct or merely confident. Analysis and Experimentation comes third because the statistical discipline is where the analyst either earns the team's trust or accumulates debt. Visualization and Dashboards comes fourth because chart encoding and dashboard design are where good analysis goes to die or compound. Stakeholder Communication and Influence comes fifth because the analyst who cannot translate findings into decisions produces shelfware reports.

Most analysts who fail to compound do so by skipping the unglamorous categories: hypothesis framing, query review, methodology pre-registration, pushback on bad asks. The thread-genre analyst skips these in favor of "data storytelling" content; the actual analyst does these because they are the leverage.

Category 01

Hypothesis and Question Framing

Ten prompts for the upstream work that determines whether the analysis answers a real question or produces an artifact no one acts on. The shape: vague asks translated into testable hypotheses, success metrics named, falsifiability enforced, confounders identified before the SQL runs, kill criteria explicit. Reject the "let me just pull the data" framing that produces analyses no one can defend.

Pairs with: 8-Component Skeleton

1. Translate a vague stakeholder ask into a testable hypothesis

Stakeholder ask: [paste the original request, ideally a verbatim quote]. Context I have: [paste]. Draft a 300-word translation memo: the question restated as a testable hypothesis ("users in segment X have higher Y than users in segment Z, by at least W effect size"), the metric definition with the named formula and aggregation, the cohort with explicit inclusion and exclusion rules, the time window with the named comparison period, the decision the answer would inform. Asks left as "is X working" produce analyses that everyone reads differently.

2. Define the success metric for a question

Question: [paste]. Available data: [paste tables and columns]. Draft a 300-word metric definition memo: the candidate metrics (typically 2 to 4), the rationale per metric, the recommended primary with explicit formula, the secondary metrics as guardrails, the gaming risks named, the segment cuts where the metric is meaningful, the cadence at which the metric is meaningful. Metrics defined as "engagement" or "impact" produce analyses that drift across stakeholders.

3. Frame a falsifiable hypothesis with named effect size

Hypothesis (informal): [paste]. Domain context: [paste, e.g. business priors, prior experiments]. Draft a 400-word hypothesis memo: the falsifiable form (specific direction, specific cohort, specific effect size that would matter, specific decision threshold), the null hypothesis, the minimum detectable effect that matters for the business decision, the named decision criteria for the result, the kill criterion (the result that would tell us the question is unanswerable). Hypotheses without a named effect size produce analyses that cannot be powered for a difference that matters and end up detecting noise.

4. Identify the cohort and time window

Question: [paste]. User base: [paste size and segmentation]. Draft a 400-word cohort memo: the named inclusion criteria, the named exclusion criteria (especially for self-selection bias), the time window with the rationale for the window length, the comparison period for pre/post or A/B, the survivorship issues to control for, the cohort size at each step of filtering. Cohorts defined loosely produce analyses that quietly drop users without anyone noticing.
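
The "cohort size at each step of filtering" item in prompt 4 can be sketched as a small attrition log. This is a minimal sketch, assuming users arrive as dictionaries; the field names (`is_internal`, `signup_day`), the filter rules, and the helper name `attrition_log` are hypothetical placeholders for illustration, not part of the pack.

```python
from typing import Callable

User = dict  # hypothetical record shape: one dict per user


def attrition_log(users: list[User], filters: list[tuple[str, Callable[[User], bool]]]):
    """Apply named filters in order and record how many users survive each step."""
    log = [("start", len(users))]
    remaining = users
    for name, keep in filters:
        remaining = [u for u in remaining if keep(u)]
        log.append((name, len(remaining)))
    return remaining, log


# Hypothetical sample data for illustration only.
users = [
    {"id": 1, "is_internal": False, "signup_day": 3},
    {"id": 2, "is_internal": True, "signup_day": 5},
    {"id": 3, "is_internal": False, "signup_day": 40},
    {"id": 4, "is_internal": False, "signup_day": 10},
]
filters = [
    ("exclude internal accounts", lambda u: not u["is_internal"]),
    ("signed up within days 1-30", lambda u: 1 <= u["signup_day"] <= 30),
]

cohort, log = attrition_log(users, filters)
for step, count in log:
    print(f"{step}: {count}")
```

Pasting a log like this into the cohort memo makes every drop in cohort size attributable to a named rule, which is the point of the prompt's closing sentence.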

5. Pre-register the analysis plan before pulling data

Question: [paste]. Hypothesis: [paste]. Draft a 500-word pre-registration memo: the analysis plan with named steps (cohort definition, metric calculation, segmentation, statistical test, multiple-comparison adjustment if any), the decision rule (the specific result that triggers each downstream action), the planned figures and tables, the analyses we will NOT run (to limit p-hacking), the named reviewers and the sign-off date. Pre-registration plans are the difference between an analysis and a fishing expedition.

6. Identify the confounders before running the analysis

Question: [paste]. Treatment or independent variable: [paste]. Outcome: [paste]. Draft a 400-word confounder memo: the candidate confounders (variables that affect both treatment and outcome), the named control strategy per confounder (stratify, match, regress, randomize, restrict cohort), the unmeasured confounders that would invalidate the analysis, the sensitivity analysis I will run, the limitation paragraph for the final memo. Analyses without explicit confounder thinking produce correlations confused with causes.

7. Define the kill criterion for the analysis

Question: [paste]. Estimated time to complete: [paste]. Draft a 300-word kill criterion memo: the data quality threshold below which the analysis cannot answer the question (named, with the specific check), the sample size threshold below which the analysis is underpowered for the decision, the time-box (the analysis stops at X hours regardless of completeness), the alternative answer if the kill criterion is hit (qualitative research, smaller question, accept uncertainty). Analyses without kill criteria become endless investigations.

8. Scope the analysis (must-have vs nice-to-have)

Question: [paste]. Time available: [paste]. Stakeholder expectations: [paste]. Draft a 400-word scope memo: the must-have output (the one or two findings that answer the core question), the nice-to-have output (the additional cuts that would extend the analysis), the explicit out-of-scope items, the time estimate per slice, the trade-off with named stakeholder sign-off. Analyses scoped by the analyst's curiosity rather than the stakeholder's decision produce a notebook of unread cells.

9. Identify the decision the analysis informs

Question: [paste]. Stakeholders: [paste]. Draft a 300-word decision memo: the specific decision the answer enables (named, not generic), the named decision-maker, the decision deadline, the alternative outcomes the decision-maker is choosing between, the threshold or evidence quality the decision-maker needs. Analyses run without a named decision produce reports that get filed and not read.

10. Write the one-line headline answer in advance

Question: [paste]. Hypothesis: [paste]. Draft a 200-word headline-first memo: a one-sentence headline for each plausible outcome of the analysis (yes with high confidence, yes with caveats, no with high confidence, inconclusive), the supporting evidence each headline would require, the recommendation each headline would imply. Headlines written in advance force the analyst to face whether the data can actually distinguish between the outcomes.

Category 02

Data Wrangling and SQL Discipline

Ten prompts for the SQL and data-engineering work that determines whether the analysis is correct or merely confident. The shape: query review for correctness and performance, dbt model design with named tests, data quality audits, schema evolution plans, joining logic explicit, NULL and time-zone handling named. Reject the "this query runs so it must be right" framing.

Pairs with: EM Pack

11. SQL query review (correctness, performance, readability)

Query: [paste]. Tables involved: [paste schema with row counts]. Intended question: [paste]. Draft a 500-word query review: the correctness check (does the query answer the question, does the join logic preserve the intended rows, are aggregations applied at the right grain), the performance check (full table scans, missing indexes, inefficient joins, large CTEs that should be materialized), the readability check (named CTEs, explicit column lists, no SELECT star in production), the named rewrites with rationale. Queries that run are not the same as queries that are correct.

12. dbt model design memo

New model needed: [paste use case, downstream consumers]. Existing models: [paste relevant lineage]. Draft a 600-word dbt model design memo: the model layer (staging, intermediate, mart), the grain of the model (one row per X), the named tests (uniqueness, not-null, accepted values, referential integrity, custom assertions), the documentation per column, the upstream dependencies and freshness expectations, the downstream consumers and SLA. dbt models without tests are technical debt that ages into production incidents.

13. Data quality audit for a source table

Table: [paste name and schema]. Suspected issues: [paste]. Draft a 500-word data quality audit: the row count vs expected, the null counts per column with the threshold for concern, the duplicate analysis (by which keys), the time-series gaps (missing days, anomalous days), the categorical column distribution (unexpected new values), the foreign-key integrity, the named remediation per issue. Data quality issues found in production by a stakeholder are an order of magnitude more expensive than issues found by audit.

14. Schema evolution plan for a breaking change

Current schema: [paste]. Proposed change: [paste]. Downstream consumers: [paste]. Draft a 600-word schema evolution plan: the migration phases (dual-write, backfill, switch reads, retire old column), the communication to downstream consumers per phase with the named owner, the SLA for migration of each consumer, the rollback plan, the kill criterion (when to revert), the verification queries before each phase. Schema changes pushed without a migration plan break dashboards engineers cannot debug.

15. Backfill plan for missing or wrong historical data

Issue: [paste, e.g. event logging gap, wrong transformation in dbt, source system bug]. Time window affected: [paste]. Draft a 500-word backfill plan: the named impact (which tables, which downstream consumers, which dashboards, which analyses already run on the wrong data), the backfill source (replay from raw, recompute from corrected logic, manual reconstruction), the dual-running validation (compare backfill to current data on overlap windows), the comms plan to affected stakeholders, the cadence of verification. Backfills run without dual-running validation often introduce new bugs.

16. Deduplication strategy with named rules

Table: [paste]. Suspected duplicate pattern: [paste, e.g. same user_id with different timestamps, same event with retry flag]. Draft a 400-word deduplication memo: the named uniqueness key (the columns that define a unique row), the dedup rule per scenario (keep latest by timestamp, keep first, keep based on a non-null priority field), the edge cases (null values in the dedup key, ties on timestamp), the validation (compare row counts before and after, sample-check the dedup choices), the dbt implementation if applicable. Deduplication done by SELECT DISTINCT without a named key produces silent data loss.
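
A named-key dedup rule of the kind prompt 16 asks for might look like the following. It is a minimal sketch, assuming rows are dictionaries, that "keep latest by timestamp" is the chosen rule, and that an ingestion id is available as a deterministic tie-breaker; the column names (`user_id`, `event_id`, `ts`, `ingest_id`) and the function name are hypothetical.

```python
def dedup_keep_latest(rows, key_cols, ts_col, tiebreak_col):
    """Keep one row per uniqueness key: latest timestamp, ties broken by tiebreak_col.

    Rows with a NULL (None) in any key column are returned separately so they
    can be reviewed instead of being silently merged or dropped.
    """
    best = {}
    null_key_rows = []
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if any(v is None for v in key):
            null_key_rows.append(row)
            continue
        rank = (row[ts_col], row[tiebreak_col])
        if key not in best or rank > (best[key][ts_col], best[key][tiebreak_col]):
            best[key] = row
    return list(best.values()), null_key_rows


# Hypothetical sample rows for illustration only.
rows = [
    {"user_id": "u1", "event_id": "e1", "ts": 100, "ingest_id": 1},
    {"user_id": "u1", "event_id": "e1", "ts": 105, "ingest_id": 2},
    {"user_id": "u1", "event_id": "e1", "ts": 105, "ingest_id": 3},  # tie on ts
    {"user_id": "u2", "event_id": "e9", "ts": 50, "ingest_id": 4},
    {"user_id": None, "event_id": "e7", "ts": 60, "ingest_id": 5},   # NULL in key
]

kept, null_key_rows = dedup_keep_latest(rows, ["user_id", "event_id"], "ts", "ingest_id")
dropped = len(rows) - len(kept) - len(null_key_rows)
print(f"kept={len(kept)} dropped={dropped} null_key={len(null_key_rows)}")
```

The before-and-after counts printed at the end are the validation step the memo names; unlike SELECT DISTINCT, every dropped row can be traced to the key and the rule.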

17. Joining logic memo (named keys, named row impact)

Source tables: [paste with schemas and row counts]. Intended output grain: [paste]. Draft a 400-word join memo: the join type per pair (inner, left, full outer, with the named reason), the join keys (with the data type and null handling), the row impact (how many rows from each table survive, dropped, or fan out), the validation queries before and after the join, the cardinality check (one-to-one, one-to-many, many-to-many) with the explicit treatment. Joins written by intuition produce analyses that quietly drop or duplicate rows.
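
The cardinality check and row-impact count from prompt 17 can be estimated from the key columns alone, before the join is written. The sketch below assumes a single-column join key and an inner join; the function name `join_profile` and the sample keys are hypothetical.

```python
from collections import Counter


def join_profile(left_keys, right_keys):
    """Classify join cardinality and predict the row impact of an inner join on one key."""
    # NULL keys never match in an equality join, so they are counted as dropped.
    left = Counter(k for k in left_keys if k is not None)
    right = Counter(k for k in right_keys if k is not None)
    left_dup = any(n > 1 for n in left.values())
    right_dup = any(n > 1 for n in right.values())
    if left_dup and right_dup:
        cardinality = "many-to-many"
    elif left_dup:
        cardinality = "many-to-one"
    elif right_dup:
        cardinality = "one-to-many"
    else:
        cardinality = "one-to-one"
    matched = left.keys() & right.keys()
    return {
        "cardinality": cardinality,
        "inner_join_rows": sum(left[k] * right[k] for k in matched),
        "left_rows_dropped": len(left_keys) - sum(left[k] for k in matched),
        "right_rows_dropped": len(right_keys) - sum(right[k] for k in matched),
    }


# Hypothetical keys: a users table joined to an orders table on user_id.
user_keys = [1, 2, 3, None]
order_keys = [1, 1, 2, 4]
profile = join_profile(user_keys, order_keys)
print(profile)
```

Here one user row fans out to two output rows, two user rows are dropped (one unmatched, one NULL key), and one order row is dropped; those are the numbers the join memo asks the analyst to state explicitly.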

18. Time-zone and date handling discipline

Question involving dates or time: [paste]. Source data time-zone handling: [paste]. Draft a 400-word time-handling memo: the named time-zone for the analysis (typically UTC for warehouses, local for user-facing reports), the conversion logic, the day boundary rules (does "day" mean UTC day or user-local day), the daylight saving handling, the named edge cases (events near midnight, users who move between time-zones, historical time-zone rule changes), the verification query. Time-zone bugs in analyses produce headline numbers that drift from production.
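
The day-boundary question in prompt 18 is easy to show with three events. This is a minimal sketch, assuming timestamps are stored in UTC; a fixed UTC-8 offset stands in for a user-local zone purely for illustration, and the names `USER_LOCAL` and `daily_counts` are hypothetical. Real analyses would normally use named zones so daylight saving rules apply.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption for illustration: a fixed UTC-8 offset instead of a named zone.
# A named zone (for example via zoneinfo.ZoneInfo) would also handle daylight saving.
USER_LOCAL = timezone(timedelta(hours=-8))


def daily_counts(utc_events, tz):
    """Count events per calendar day, where 'day' is defined in the given time zone."""
    return Counter(ts.astimezone(tz).date().isoformat() for ts in utc_events)


# Hypothetical events, all stored in UTC.
events = [
    datetime(2024, 1, 15, 23, 30, tzinfo=timezone.utc),
    datetime(2024, 1, 16, 2, 0, tzinfo=timezone.utc),   # still Jan 15 at UTC-8
    datetime(2024, 1, 16, 9, 0, tzinfo=timezone.utc),
]

utc_days = daily_counts(events, timezone.utc)
local_days = daily_counts(events, USER_LOCAL)
print("UTC days:  ", dict(utc_days))
print("Local days:", dict(local_days))
```

The same three events give one event on Jan 15 and two on Jan 16 under UTC days, and the reverse under local days, which is why the memo has to name the rule.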

19. NULL handling rules for the analysis

Analysis: [paste]. Columns that may be NULL: [paste with NULL rate per column]. Draft a 400-word NULL handling memo: the named rule per column (exclude rows, impute with named strategy, treat as a category, propagate NULL), the rationale per rule (does NULL mean missing or zero or unknown), the impact on the analysis (rows excluded, segments affected), the alternative analyses for sensitivity. NULL handling done by COALESCE without a named rule produces analyses with silent assumptions.
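
The silent assumption behind an unexamined COALESCE can be shown in a few lines. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `sessions` table and `minutes` column; it compares "exclude NULL rows" against "treat NULL as zero" for the same average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration only.
conn.execute("CREATE TABLE sessions (user_id INTEGER, minutes REAL)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, None), (4, 30.0)],
)

avg_excluding_null, avg_null_as_zero, null_rows = conn.execute(
    """
    SELECT AVG(minutes),               -- aggregate skips NULL rows
           AVG(COALESCE(minutes, 0)),  -- NULL is assumed to mean zero
           SUM(minutes IS NULL)        -- how many rows the rule affects
    FROM sessions
    """
).fetchone()
conn.close()

print(f"exclude NULL: {avg_excluding_null}")
print(f"NULL as zero: {avg_null_as_zero}")
print(f"rows affected: {null_rows}")
```

Neither number is wrong in itself; the memo's job is to say which meaning of NULL (missing, zero, or unknown) justifies the rule, and to report the other figure as a sensitivity check.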

20. Sampling strategy for large tables

Table: [paste with row count, partitioning, indexing]. Analysis goal: [paste]. Draft a 400-word sampling memo: the named sampling method (random, stratified, deterministic by hash, time-based), the sample size with the rationale (sufficient for the statistical power needed), the bias risks per method, the validation of the sample against the full population (key distributions match), the named scale-up path if the analysis needs the full data. Samples drawn by LIMIT or RAND() without stratification produce biased segment analyses.
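
Of the sampling methods listed in prompt 20, "deterministic by hash" is the one most often improvised. A minimal sketch follows, assuming string user ids and a sampling rate between 0 and 1; the salt value and the function name `in_sample` are hypothetical choices for illustration.

```python
import hashlib


def in_sample(user_id: str, rate: float, salt: str = "analysis-v1") -> bool:
    """Deterministic sample: the same user_id and salt always give the same answer."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < rate


# Hypothetical population of user ids.
user_ids = [f"user-{i}" for i in range(10_000)]
sample_10 = [u for u in user_ids if in_sample(u, rate=0.10)]
sample_05 = [u for u in user_ids if in_sample(u, rate=0.05)]

print(f"10% sample size: {len(sample_10)}")
print(f" 5% sample size: {len(sample_05)}")
```

Because membership depends only on the id and the salt, reruns and joins across tables see the same users, and a smaller rate yields a subset of a larger one. The sample is still unstratified, so the memo's validation step (key distributions match the full population) applies unchanged.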

Most analytics advice circulating on LinkedIn is content marketing. The work that actually compounds is the pre-registered analysis plan and the SQL review that catches the join bug.

PromptLeadz Data Analyst Pack

Category 03

Analysis and Experimentation

Ten prompts for the statistical and experimental work where the analyst either earns the team's trust or accumulates methodological debt. The shape: A/B tests with explicit kill criteria, power calculations done in advance, causal inference matched to the question, anomaly detection that distinguishes signal from noise, segment analyses with named cohorts. Reject the "the result is significant" framing that hides effect size and assumptions.

Pairs with: PM Pack

21. A/B test design with explicit kill criteria

Hypothesis: [paste]. Variants: [paste]. Population: [paste size and segmentation]. Draft a 600-word A/B test design: the falsifiable hypothesis with the minimum detectable effect, the primary metric and the guardrail metrics, the sample size calculation with stated power (typically 0.8) and alpha (typically 0.05), the planned duration with the named stopping rule, the kill criteria (the result on guardrails that stops the test early), the analysis plan (fixed-horizon or sequential, multiple-comparison handling), the decision rule with the named decision-maker. Tests run without kill criteria produce zombie experiments with no decision.

22. Sample size and statistical power calculation

Metric: [paste baseline rate and variance]. Minimum detectable effect: [paste]. Variants: [paste count]. Draft a 400-word power calculation memo: the formula used (with assumed alpha and power), the sample size per variant, the duration estimate based on traffic, the sensitivity (how sample size changes with effect size assumptions), the named trade-off (smaller effect detection requires more traffic and longer duration), the recommendation. Tests run with vibes-based sample sizes produce inconclusive results that everyone interprets favorably.
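
The "formula used" in prompt 22 depends on the metric; for a conversion-rate metric, one common choice is the normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided test, equal allocation across two variants, and an absolute minimum detectable effect; the function name and the example baseline of 10% are hypothetical, and tests with more than two variants would typically also adjust alpha.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sided test of two proportions."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


# Sensitivity: how the requirement moves with the assumed effect size.
baseline_rate = 0.10  # hypothetical baseline conversion rate
sizes = {mde: sample_size_per_variant(baseline_rate, mde) for mde in (0.005, 0.01, 0.02)}
for mde, n in sizes.items():
    print(f"MDE {mde:.3f} absolute -> {n} users per variant")
```

The sensitivity loop makes the trade-off in the memo concrete: halving the detectable effect roughly quadruples the required sample, and therefore the duration at fixed traffic.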

23. Cohort analysis with named segmentation

Behavior to analyze: [paste]. Cohort dimension: [paste, e.g. signup month, first-purchase channel, initial product]. Draft a 600-word cohort analysis memo: the cohort definition with named inclusion criteria, the time axis (calendar time vs cohort age), the metric tracked per cohort, the comparison (newer vs older cohorts, by segment, vs a benchmark), the patterns identified (improving, stable, declining), the explanatory hypotheses with named tests, the implication for product or marketing. Cohort analyses presented as a single average lose the time-since-acquisition signal.

24. Funnel analysis with drop-off cause hypotheses

Funnel stages: [paste with current numbers]. Period: [paste]. Draft a 500-word funnel analysis: the conversion rate at each stage with the historical comparison, the segment cuts where conversion differs materially, the hypothesized causes per major drop-off (UX friction, qualification mismatch, technical failure, content gap), the proposed experiments to test each cause, the named owner. Funnel analyses that report the rates without hypothesized causes produce no product decisions.

25. Causal inference design (DiD, IV, regression discontinuity)

Question: [paste, causal in nature]. Available data: [paste]. Random assignment available: [paste yes or no]. Draft a 600-word causal design memo: the recommended method (RCT if possible, otherwise difference-in-differences, instrumental variables, regression discontinuity, propensity-score matching, with the rationale per choice), the identifying assumptions, the falsifiability tests (parallel trends test for DiD, first-stage F-statistic for IV, density test for RDD), the sensitivity analyses, the limitation paragraph. Causal claims made without explicit identifying assumptions are correlation dressed up.
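
For the difference-in-differences option in prompt 25, the point estimate itself is simple arithmetic; the identifying assumptions carry the weight. This is a minimal sketch of the two-group, two-period case with hypothetical numbers; it assumes parallel trends holds and does not produce a standard error or run the parallel-trends check the memo requires.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences point estimate."""
    change_treated = mean(treated_post) - mean(treated_pre)
    change_control = mean(control_post) - mean(control_pre)
    # Under parallel trends, the control change is what the treated group
    # would have experienced without the intervention.
    return change_treated - change_control


# Hypothetical metric values per unit, for illustration only.
treated_pre, treated_post = [10.0, 12.0], [15.0, 17.0]
control_pre, control_post = [9.0, 11.0], [11.0, 13.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"naive pre/post change in treated group: {mean(treated_post) - mean(treated_pre)}")
print(f"difference-in-differences estimate:     {effect}")
```

The naive pre/post change attributes the whole movement to the intervention; subtracting the control group's change removes the part both groups shared.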

26. Time-series decomposition for a metric

Metric: [paste history, ideally 2-plus years]. Suspected pattern: [paste, e.g. trend, seasonality, anomaly]. Draft a 500-word decomposition memo: the trend component with the named smoothing method, the seasonal component (weekly, monthly, annual cycles), the residual analysis (anomalies, structural breaks), the named drivers per component (product changes, marketing campaigns, external events), the implication for forecasting and for the question. Time-series analyses that report "the metric is up" without decomposition miss whether the change is trend, seasonality, or one-off.

27. Anomaly detection memo (real vs noise)

Anomaly observed: [paste metric, time, magnitude]. Baseline: [paste expected range]. Draft a 400-word anomaly memo: the magnitude vs typical variance (is this a 2-sigma or 5-sigma event), the candidate causes ranked by likelihood (instrumentation change, real product change, external event, data quality issue), the verification steps per cause (queries to run, logs to check), the recommended action (investigate further, alert, accept as noise), the named escalation. Anomalies that get investigated as bugs and turn out to be product successes are calibration failures; the reverse is worse.
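
The "2-sigma or 5-sigma" question in prompt 27 can be answered with a few lines before any investigation starts. The sketch below assumes a stable, non-seasonal baseline window, which is a strong assumption for most product metrics; the function name and the baseline values are hypothetical.

```python
from statistics import mean, stdev


def sigma_distance(baseline, observed):
    """How many baseline standard deviations the observed value sits from the baseline mean."""
    mu = mean(baseline)
    sd = stdev(baseline)  # sample standard deviation of the baseline window
    if sd == 0:
        raise ValueError("baseline has no variance; sigma distance is undefined")
    return (observed - mu) / sd


# Hypothetical daily values for the baseline window.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]

for observed in (104, 110):
    z = sigma_distance(baseline, observed)
    print(f"observed {observed}: {z:.1f} sigma from baseline")
```

With trend or weekly seasonality in the metric, the baseline would first need to be detrended or restricted to comparable days, otherwise ordinary seasonal swings show up as anomalies.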

28. Survival analysis for churn prediction

Cohort: [paste user definition]. Outcome event: [paste churn definition]. Available covariates: [paste]. Draft a 500-word survival analysis memo: the time-to-event definition, the censoring rules (users who have not churned yet), the recommended model (Kaplan-Meier for description, Cox proportional hazards for covariates, accelerated failure time for predictions), the named assumptions (proportional hazards if Cox), the validation approach, the interpretation for product or CS. Churn predictions made with logistic regression on a fixed window miss the time-varying nature of risk.
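
The Kaplan-Meier estimator named in prompt 28 shows how censoring enters the calculation: users who have not churned yet stay in the at-risk count until their observation ends. This is a minimal pure-Python sketch with hypothetical data and a hypothetical function name; it returns point estimates only, with no confidence bands.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time where at least one churn occurred.

    durations: observed time per user (time to churn, or time observed so far).
    churned:   True if the user churned at that time, False if censored.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, churned) if d == t and e)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical cohort: months observed, and whether the user churned.
durations = [1, 2, 2, 3, 4]
churned = [True, True, False, True, False]

curve = kaplan_meier(durations, churned)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.2f}")
```

Dropping the two censored users instead would leave three churners out of three and a survival estimate of zero by month 3, which illustrates why the censoring rules belong in the memo.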

29. Pre/post analysis with seasonality control

Intervention: [paste, e.g. product change, pricing change, policy change]. Pre-period and post-period: [paste]. Metric: [paste]. Draft a 500-word pre/post memo: the seasonality control strategy (year-over-year comparison, control group if available, time-series regression with seasonal dummies), the confound check (other changes that happened in the post-period), the effect size estimate with the confidence interval, the falsifiability test (placebo period before the intervention), the limitation paragraph. Pre/post analyses without seasonality control produce false positives in the busy season.

30. Cluster analysis with named clusters and validation

Population: [paste]. Features for clustering: [paste]. Draft a 600-word cluster analysis memo: the feature scaling decision (standardize, robust scale, none), the algorithm choice (k-means, hierarchical, DBSCAN, Gaussian mixture, with the rationale), the cluster count selection (elbow, silhouette, business interpretability), the named clusters with profiles, the stability validation (clusters reproduce on a holdout sample), the named business implication per cluster. Clusters presented without stability validation are often noise; this prompt forces the check.

Category 04

Visualization and Dashboards

Ten prompts for the chart and dashboard work where good analysis goes to die or compound. The shape: chart encoding matched to the question, dashboards designed for one named decision, color and accessibility audited, annotations explaining why a point matters, single-number dashboards where appropriate. Reject the "more charts means more insight" framing that produces dashboards no one reads.

Pairs with: Workhorse Pack

31. Chart selection memo (which chart for which question)

Question: [paste]. Data shape: [paste, e.g. one categorical and one continuous variable, time-series of a metric, distribution of a quantity]. Draft a 300-word chart selection memo: the question type classified (comparison, trend, distribution, correlation, composition, ranking), the recommended chart for the question type (with the rationale), the rejected alternatives (with the named reason), the encoding decisions (axis, color, size, position), the readability check (will the audience read this in 5 seconds). Pie charts for more than 4 slices, dual-axis charts without explicit need, and 3D charts in any case are usually wrong.

32. Dashboard design for an exec audience

Audience: [paste exec roles]. Decisions the dashboard enables: [paste named]. Draft a 500-word exec dashboard design memo: the top-level number (the one number that answers the most important question), the supporting cuts (segment, time, comparison vs plan), the alerting tiles (anomalies needing attention), the explicit non-goals (analyses the exec should not run from this dashboard), the cadence the dashboard supports, the named owner. Exec dashboards that try to answer every question end up answering none.

33. Dashboard design for an operational team

Team: [paste, e.g. support, CS, ops, sales]. Daily decisions: [paste]. Draft a 500-word ops dashboard design memo: the named real-time tiles (status, queue depth, SLA tracking), the daily-cadence tiles, the alerting thresholds with the named owner for each alert, the drill-down paths (from summary to detail), the data freshness SLA, the named owner. Operational dashboards refreshed daily for a team that needs hourly data fail silently for weeks.

34. Dashboard rationalization (which to kill)

Current dashboards: [paste list with owner and last-viewed dates]. Draft a 400-word dashboard rationalization memo: the dashboards with active use (keep, with the named decision they support), the dashboards with no active use (kill or archive), the dashboards with overlapping content (consolidate into one with the named owner), the metric ownership (every dashboard has one owner and one named decision), the maintenance cadence. Dashboard sprawl produces metric fatigue and conflicting numbers for the same question.

35. Color and accessibility audit for charts

Chart or dashboard: [paste description]. Current color choices: [paste]. Draft a 300-word accessibility memo: the color blindness check (passes deuteranopia, protanopia, and tritanopia simulation), the contrast ratio check (text vs background), the redundant encoding (color plus shape or label, not color alone), the print-friendly check (grayscale conversion still readable), the cultural-color considerations for international audiences. Charts that fail color-blindness checks silently fail the roughly 8% of men and 0.5% of women with a color vision deficiency.
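
The contrast ratio check in prompt 35 can be computed rather than eyeballed. The sketch below follows the relative-luminance and contrast-ratio definitions from WCAG 2.x for 8-bit sRGB colors; the pack itself does not name a threshold, so the 4.5:1 figure used here (the WCAG AA level for normal-size text) is an assumption, and the function names and example colors are hypothetical.

```python
def _linear(channel_8bit):
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # assumed threshold: WCAG AA for normal-size text

# Hypothetical chart colors: label color vs background color.
pairs = {
    "black on white": ((0, 0, 0), (255, 255, 255)),
    "light gray on white": ((200, 200, 200), (255, 255, 255)),
}
for name, (fg, bg) in pairs.items():
    ratio = contrast_ratio(fg, bg)
    verdict = "pass" if ratio >= AA_NORMAL_TEXT else "fail"
    print(f"{name}: {ratio:.2f}:1 ({verdict})")
```

This covers only the text-versus-background item; the color-blindness simulation and the redundant-encoding checks in the memo need separate tooling or manual review.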

36. Annotation discipline for charts

Chart: [paste]. The point that matters: [paste]. Draft a 300-word annotation memo: the named point or region the annotation calls out, the brief text (under 12 words) explaining why this point matters, the visual treatment (arrow, highlighted region, callout box), the placement to avoid covering other data, the omission rule (most charts do not need annotation, the rule for when they do). Annotations that explain what the viewer can already see add no value; annotations that explain why a point matters do.

37. Comparison chart design (vs plan, prior period, benchmark)

Metric: [paste]. Comparison: [paste vs plan, vs prior period, vs benchmark]. Draft a 400-word comparison chart memo: the chart type (bullet chart, side-by-side bar, slope chart, with rationale), the named reference (the explicit comparison line or value), the variance display (absolute, percent, or both), the time granularity, the threshold for highlighting (when does the variance become notable), the call-out for the actionable variance. Comparison charts without explicit reference values force the viewer to compute the variance mentally.

38. Funnel visualization with named drop-offs

Funnel: [paste stages and conversion rates]. Comparison: [paste, e.g. vs prior period, vs segment]. Draft a 300-word funnel visualization memo: the chart type (horizontal funnel, vertical bar, step chart, with rationale), the rate display per step (absolute count, percent of prior step, percent of total), the highlighting for the biggest drop-off, the segment overlay if multiple funnels are compared, the annotation for the most important transition. Funnels visualized as cone-shaped objects without numbers force interpretation; visualized with explicit rates support decision.

39. Cohort retention heatmap design

Cohorts: [paste cohort definitions and time periods]. Retention metric: [paste definition]. Draft a 300-word heatmap design memo: the axes (rows as cohorts, columns as time-since-acquisition, values as retention rate), the color scale (sequential for retention, diverging if comparing to a benchmark), the highlight rule (cohorts that materially deviate from the average), the call-out for the actionable pattern (cohorts that retain materially better or worse), the cadence of update. Cohort heatmaps that show only counts without retention rates miss the cohort-aging signal.

40. Single-number dashboard for one critical metric

Metric: [paste]. Audience: [paste]. Decision the metric supports: [paste]. Draft a 300-word single-number design memo: the headline number with the precision rule (round to the decision-relevant precision), the trend indicator (vs prior period with explicit direction), the threshold or target (the value that triggers attention), the supporting context (no more than 2 lines, the why behind the number), the drill-down link (one click to the detail). Single-number dashboards built for one decision compound; multi-tile dashboards trying to be single-number fail.

A good analysis names the cohort. A good chart names the comparison. A good memo puts the answer in the first sentence. The job is notebook work. The threads about the job are not.

PromptLeadz Data Analyst Pack, Section 5

Category 05

Stakeholder Communication and Influence

Ten prompts for the translation work that determines whether the analyst influences decisions or produces shelfware. The shape: kickoffs that lock the question before the SQL runs, findings memos with the answer in the first sentence, pushback memos that decline a bad ask, model cards that document assumptions, exec readouts that survive the CFO. Reject the "let me build a deck" framing that buries the answer in 30 slides.

Pairs with: Operator Pack

41. Stakeholder kickoff for a new analytics request

Stakeholder: [paste role, prior context]. Ask: [paste original request]. Draft a 400-word kickoff agenda: the open questions to clarify the request (the decision, the deadline, the precision needed), the data availability check before committing, the explicit scope (what is in, what is out), the timeline estimate with the named confidence, the deliverable format (memo, dashboard, presentation), the next checkpoint. Analytics kickoffs without explicit decision and deadline produce open-ended investigations.

42. Findings memo with the answer in the first sentence

Analysis complete. Question: [paste]. Headline finding: [paste]. Supporting evidence: [paste]. Draft a 500-word findings memo: the answer in the first sentence (specific, named, no preamble), the supporting evidence in three to five points with the named effect size and confidence, the limitations paragraph (what the data could not answer, what assumptions could change the answer), the recommended next action with named owner, the appendix link to the notebook. Findings memos that lead with methodology lose the reader before the answer.

43. Pushback memo for a request the data cannot answer

Ask: [paste]. Why the data cannot answer: [paste, e.g. instrumentation gap, sample size, identification problem, fundamental ambiguity]. Draft a 400-word pushback memo: the named reason the data cannot answer the question as asked, the alternative question the data CAN answer, the path to enable the original question (instrumentation, experiment, sample, time), the realistic timeline, the recommended next step. Pushback delivered as a yes-with-caveats produces analyses that get cited as if they answered the original question.

44. Model card for a deployed model

Model: [paste type, version, deployment context]. Draft a 600-word model card: the intended use (the specific decisions the model supports), the training data (sources, time period, known biases), the metrics (primary and guardrail, with the holdout performance), the named failure modes (the inputs or contexts where the model performs worse), the monitoring approach (data drift, performance drift, alerting), the retraining cadence, the named owner. Deployed models without model cards become opaque artifacts the team is afraid to touch.
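One way to make the data-drift line of the monitoring approach concrete is a population stability index (PSI) over binned feature counts. The pack does not prescribe PSI; it is swapped in here as one common drift measure, and the function name, the bin counts, and the `eps` floor are assumptions made for the sketch.

```python
import math


def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """PSI between a training-time and a live distribution over the same bins.

    PSI = sum over bins of (actual_share - expected_share) * ln(actual_share / expected_share).
    `eps` floors empty bins so the logarithm stays defined (an assumption of this sketch).
    """
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    score = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        expected_share = max(expected / expected_total, eps)
        actual_share = max(actual / actual_total, eps)
        score += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return score


# Hypothetical bin counts for one input feature
training_bins = [250, 250, 250, 250]
live_bins = [100, 200, 300, 400]
drift = population_stability_index(training_bins, live_bins)
print(round(drift, 3))
```

The alerting threshold belongs in the model card next to the named owner. Commonly quoted PSI cutoffs are rules of thumb, so the card should state which value the team chose and why.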

45. Explaining statistical uncertainty to a non-technical audience

Finding: [paste with effect size and confidence interval]. Audience: [paste]. Draft a 300-word uncertainty memo: the headline finding stated in plain terms, the named uncertainty (typically as a range, not a p-value), the analogy that conveys the uncertainty (the polling-margin-of-error analogy works for many audiences), the implication for the decision (what the uncertainty means for the choice), the additional data that would reduce the uncertainty. Statistical uncertainty explained as p-values fails non-statistical audiences; explained as ranges and analogies it lands.
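The "range, not a p-value" framing can be sketched for a difference between two conversion rates. This is a minimal example under assumptions: it uses the normal-approximation interval for a difference in proportions, and the function names and sample counts are hypothetical.

```python
import math


def diff_in_rates_interval(conv_a, n_a, conv_b, n_b, z=1.96):
    """Difference in conversion rates (B minus A) with an approximate 95% interval.

    Uses the normal approximation, which assumes reasonably large samples.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    diff = rate_b - rate_a
    standard_error = math.sqrt(
        rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b
    )
    return diff, diff - z * standard_error, diff + z * standard_error


def plain_language(diff, low, high):
    """State the finding as a best estimate and a range, in percentage points."""
    return (
        f"Best estimate: {diff * 100:+.1f} percentage points. "
        f"Plausible range: {low * 100:+.1f} to {high * 100:+.1f} points."
    )


# Hypothetical test: 500 of 10,000 convert in A, 600 of 10,000 convert in B
diff, low, high = diff_in_rates_interval(500, 10_000, 600, 10_000)
print(plain_language(diff, low, high))
```

The half-width of the range plays the same role as a poll's margin of error, which is why that analogy tends to transfer to a non-technical audience.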

46. Translating findings into product recommendations

Finding: [paste]. Product context: [paste]. Draft a 500-word product recommendation memo: the named recommendation (specific to a feature, surface, or workflow), the evidence linking the finding to the recommendation, the named alternative recommendations considered (with rationale for rejection), the implementation considerations (engineering effort, design implications, risks), the success metric for the recommendation, the named owner. Findings without explicit recommendations produce decks that get acknowledged and not acted on.

47. Quarterly analytics roadmap with named decisions enabled

Quarter: [paste]. Stakeholder asks: [paste]. Draft a 600-word analytics roadmap memo: the top three analyses for the quarter (each tied to a named decision), the explicit deprioritization (the asks we are not taking, with the named reason), the dependencies (instrumentation, dbt models, infrastructure), the cadence of stakeholder updates, the kill criteria per analysis (when to stop investing). Analytics roadmaps written as a list of asks yield a team that ships dashboards no one reads.

48. Exec readout from a complex analysis

Analysis: [paste topic and key findings]. Exec audience: [paste roles and stated priorities]. Draft a 500-word exec readout memo: the headline in one sentence, the three supporting points with the named effect size, the implication for strategy or operations, the named risks or limitations, the asks of the exec (decision, resource, follow-up), the next checkpoint. Exec readouts that put methodology before the answer lose the room.

49. Defending the methodology under questioning

Pushback received: [paste, e.g. "sample is too small", "the cohort is biased", "the effect could be reverse causality", "this is correlation not causation"]. Draft a 400-word defense memo: the named pushback restated factually, the response (yes-and where the pushback is partially valid, the additional analysis that addresses it, the acknowledgment when the pushback is fundamentally limiting), the alternative analyses already considered, the residual uncertainty stated honestly. Methodology defenses that dismiss valid pushback erode the analyst's credibility long-term.

50. Saying no to a vanity dashboard request

Request: [paste, e.g. "can you build a dashboard for X" where X is not tied to a decision]. Draft a 300-word polite decline memo: the named reason the dashboard would not be used (no decision tied to it, no clear owner, overlaps with existing dashboards), the alternative offered (a one-time analysis, a simpler chart, a link to existing dashboard), the request criteria for future dashboard asks (named decision, named owner, named cadence), the cadence of dashboard portfolio review. Vanity dashboards built without a decision become technical debt that ages into incidents nobody can debug.

How the prompts fit a real analyst week and quarter

Daily: SQL review on PRs, anomaly check on key metrics, kick off a new ask if one comes in. The daily discipline is what keeps the data layer correct.

Weekly: findings memo on the analysis closed this week, exec readout for the analytics partner, dashboard health check, dbt test failure review. The weekly cadence is where the work gets shipped.

Monthly: dashboard rationalization audit, cohort analysis refresh for top metrics, model performance review for deployed models, data quality audit for top tables. The monthly cadence catches drift before it becomes quarterly debt.

Quarterly: analytics roadmap with named decisions, deprecated dashboard review, methodology retrospective, stakeholder satisfaction check. The quarterly cadence is where the analytics function justifies its existence.

Annually: metric definition refresh, model retraining and revalidation, dbt project audit, stakeholder portfolio rebalancing. The annual cadence is where the foundation gets reinforced.

Five mistakes that wreck analyst prompts

1. Filling in the prompt with vibes instead of specific cohorts, time windows, and effect sizes. The prompts ask for the named cohort, the explicit time window, the minimum detectable effect, the specific dollar or percent threshold. Filling them with "recently", "engaged users", or "big impact" produces equally vague output, which a senior analyst will reject.

2. Treating the output as the final memo. The prompts produce drafts. The actual memo is the draft after the SQL has been re-run, the numbers have been verified against the system of record, the methodology has been peer-reviewed, and the LLM-cliche phrasing has been edited out.

3. Skipping the prompts that ask uncomfortable questions. The pushback memo, the kill criterion memo, the confounder identification, the methodology defense. The avoided prompts are usually the ones with the most leverage. Notice the avoidance.

4. Sharing the LLM draft externally without redaction. The prompts produce internal artifacts naming specific cohorts, dollar values, customer behavior, and methodological assumptions. The outputs should not leave the analytics organization without explicit review.

5. Running the data-influencer prompts instead of these. Prompts that produce "actionable insights" content reinforce the genre this pack rejects. Calibration to the LinkedIn-thread voice produces threads, not findings memos.

Sources and further reading

The pack draws on a body of published work from senior practitioners. Recommended reading for analysts and data scientists who want depth beyond the threads.

Edward Tufte's work on data visualization, particularly The Visual Display of Quantitative Information and Envisioning Information, remains the canonical reference for chart encoding, annotation discipline, and the rejection of chartjunk.

Andrew Gelman's writing at statmodeling.stat.columbia.edu is the most rigorous contemporary blog on statistical methodology, p-hacking, and the credibility crisis in applied research.

Hadley Wickham's writing on tidy data and analytical workflow remains the foundation for modern analytical practice, especially the R for Data Science book co-authored with Garrett Grolemund and Mine Çetinkaya-Rundel.

Cassie Kozyrkov's writing on decision intelligence, originally as Chief Decision Scientist at Google, frames the analyst-to-decision-maker translation problem more clearly than any other contemporary source.

About PromptLeadz

PromptLeadz publishes free component-built prompt packs and the production-grade Drop-in utilities that wrap them. The franchise covers role-based packs (PM, EM, CSM, Sales Leader, Operator, Data Analyst, VC), format-based packs (.md agent files in breadth and depth), and the underlying frameworks (the 8-Component Skeleton, the Anti-Prompt-Engineering Manifesto).

Every pack rejects the LinkedIn-influencer voice at the prompt level by banning the genre's signature phrases inline. The result is output calibrated for memos that survive peer review, not threads that go viral. Free packs ship with no email gate at promptleadz.com.

Questions people ask

Who is this data analyst prompt pack for?

Data analysts, senior data analysts, data scientists, analytics engineers, and ML engineers acting in an analytical capacity. Most useful for analysts at B2B and consumer software companies working with engineering teams of 5 to 50 and product or commercial stakeholders. The prompts assume basic analytical literacy: SQL, statistical inference, A/B testing, cohort analysis, dashboard design.

Does it work for early-career analysts and senior data scientists?

Yes for both, with calibration. Early-career analysts lean on the hypothesis framing, SQL review, chart selection, and findings memo prompts (the foundational discipline). Senior data scientists lean on the causal inference, model card, methodology defense, and roadmap prompts (the leadership work). The cohort analysis and pre-registration prompts work across both.

Why does the pack ban phrases like data-driven and actionable insights?

Both phrases are legitimate ideas reduced to LinkedIn cliches. The pack bans the cliche framing because it produces low-calibration output that performs analytical rigor rather than naming the specific cohort, the specific effect size, or the specific decision the analysis informs. Real analysts talk about specific cohorts and confidence intervals; the cliche genre talks about being data-driven in general.

What output format do the prompts produce?

Notebook-memo register: flat, factual, named cohorts, specific effect sizes, explicit limitations. The opposite of LinkedIn-thread register. The prompts are calibrated for internal use: pre-registered analysis plans, SQL review memos, A/B test designs, model cards, findings memos that hold up under peer review and exec scrutiny.

How does this pair with other PromptLeadz packs?

Pairs with the 8-Component Skeleton framework as the foundation, the PM Pack for the cross-functional analytics-to-product work, the EM Pack for the data infrastructure work with engineering, and the Operator Pack for the finance and ops dashboards that get reviewed by the CFO.

Are these prompts safe to share with my team?

The prompts themselves are free to share. The outputs of the cohort analysis, A/B test, model card, methodology defense, and findings prompts are confidential and should not leave the analytics organization without explicit review. Stakeholder-facing memos are designed for redaction before external sharing.

Do these prompts work with Claude, ChatGPT, and Gemini?

Yes for all three. The prompts are built on the 8-Component Skeleton, which works across frontier models. Claude tends to handle the methodology defense and causal inference prompts most naturally; ChatGPT requires slightly tighter constraints on effect-size formatting; Gemini works best when the output format is named explicitly with section headers.

What is the difference between a data analyst and a data scientist in this context?

A data analyst is responsible for producing findings memos, dashboards, and analytical artifacts that inform business decisions, typically using SQL, statistical tools, and visualization software. A data scientist is responsible for similar artifacts plus building models, designing causal inference studies, and deploying ML systems. The pack is calibrated for both; data analysts will lean on the hypothesis, SQL, visualization, and stakeholder categories; data scientists will additionally lean on the causal inference, model card, and methodology prompts.

The franchise: free packs, frameworks, and the manifesto

The thesis: The Anti-Prompt-Engineering Manifesto. The framework: The 8-Component Skeleton.

The production-grade versions

The free pack is the proof. The Drop-ins are the production-grade utilities that wrap evaluation, voice calibration, and output discipline around prompts. The bundle saves $191 against individual purchases.

All Ten Drop-ins Bundle - $489 · The Sycophancy Killer - $79 · The Workslop Filter - $49

Free packs, no email gate · Calibrated for 2026 frontier models · promptleadz.com

Free 50 Data Analyst Prompt Pack 2026: For the Analyst, Not the Influencer
