What is Data Analytics?

Data Analytics is the process of collecting, cleaning, analyzing, and visualizing data to drive smart business decisions.

In 2026, every company — from early-stage startups to Fortune 500 giants — needs data analysts to make sense of their data. This isn’t a trend. It’s the new normal.

Right now is the best time to enter this field:

  • 1.5 lakh+ open Data Analyst roles in India (LinkedIn/Naukri, 2026)
  • Fresher salary: ₹4–6 LPA | Mid-level: ₹10–16 LPA | US average: $95,000/year
  • No CS degree required — non-tech candidates are actively preferred

Who Should Become a Data Analyst?

You are the right fit if you are:

Good news — you don’t need to code to get started. You just need the right roadmap.


End-to-End Data Analyst Roadmap 2026

Phase 1 — Foundations

Topic What to Learn Why It Matters
Excel / Google Sheets Formulas, Pivot Tables, VLOOKUP, Charts Every analyst uses this daily
Statistics Basics Mean, Median, Std Dev, Probability Core for interpreting data
Data Types Structured vs Unstructured, Quantitative vs Qualitative Know what data you’re working with
Business Metrics KPIs, Revenue, Churn, Conversion Rate, DAU/MAU Speak the language of business

Phase 2 — SQL: The #1 Skill Tested in Interviews

SQL is asked in 95% of Data Analyst interviews. No SQL = No job.

Topic What to Learn
SQL Basics SELECT, WHERE, ORDER BY, LIMIT, DISTINCT
Aggregations GROUP BY, HAVING, COUNT, SUM, AVG, MIN, MAX
Joins INNER, LEFT, RIGHT, FULL OUTER, SELF, CROSS
Subqueries Nested, Correlated, EXISTS
Window Functions ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, NTILE
CTEs WITH clause for readable, clean queries
Performance EXPLAIN, Indexing basics

Phase 3 — Python for Data Analysis

Library What to Learn Use Case
NumPy Arrays, numerical operations Fast math on large datasets
Pandas DataFrames, merge, groupby, pivot 80% of your daily Python work
Matplotlib Line, bar, pie charts Quick visualization
Seaborn Heatmaps, distribution plots Statistical visualization
Jupyter Notebook Writing + documenting code Industry-standard tool

Must-know Pandas operations: .fillna(), .dropna(), .merge(), .groupby(), .agg(), .loc[], .pivot_table()


Phase 4 — Data Visualization

Tool Best For Why Learn It
Power BI Business dashboards Most-used in Indian companies
Tableau Interactive visual analytics Most-used in MNCs and product companies
Looker Studio Free cloud reporting Great for freelancers and digital marketers

Build at least 3 dashboards — Sales, HR, and one personal project.


Phase 5 — Real Projects + Portfolio

Project 1: Sales Performance Dashboard Excel + Power BI | Dataset: Superstore Sales (Kaggle)

Project 2: E-Commerce Customer Analysis SQL + Python + Tableau | Dataset: Brazilian E-Commerce (Kaggle)

Project 3: HR Attrition Analysis Python (Pandas + Seaborn) + Power BI | Dataset: IBM HR Analytics (Kaggle)

Upload all projects to GitHub with a clear README. Post findings on LinkedIn with screenshots — recruiters notice this.


Phase 6 — Interview Preparation


20-Week Study Plan

Week Topic Daily Hours Goal
1 Excel Basics 2 hrs Formulas, pivot tables
2 Excel Advanced + Statistics 2 hrs VLOOKUP, stats fundamentals
3 SQL Basics 2–3 hrs SELECT, WHERE, GROUP BY
4 SQL Intermediate 2–3 hrs Subqueries, aggregations
5 SQL Joins 2–3 hrs All 6 join types with practice
6 SQL Window Functions 2–3 hrs RANK, LAG, LEAD, running totals
7 SQL Practice + CTEs 3 hrs 30+ LeetCode/HackerRank problems
8 Python NumPy + Pandas 2–3 hrs Read, filter, clean data
9 Python Pandas Advanced 2–3 hrs Merge, groupby, EDA workflow
10 Python Visualization 2 hrs Matplotlib + Seaborn
11 Power BI Basics 2 hrs First report, data connections
12 Power BI + DAX 2–3 hrs Measures, calculated columns
13 Tableau Basics 2 hrs Charts, filters, dashboards
14 Project 1: Sales Dashboard 3 hrs Full end-to-end project
15 Project 2: E-Commerce Analysis 3 hrs SQL + Python + Tableau
16 Project 3: HR Attrition 3 hrs Python + Power BI
17 Portfolio + GitHub + LinkedIn 2 hrs Upload projects, write case studies
18 Resume + Job Applications 2 hrs Apply to 20+ jobs/week
19 SQL Interview Prep 3 hrs Practice all 30 questions
20 Mock Interviews + Case Studies 3 hrs Full round simulations

Tools Every Data Analyst Must Know in 2026

Category Tool Free? Priority
Spreadsheets Excel / Google Sheets ✅ Free 🔴 Must
Database MySQL / PostgreSQL ✅ Free 🔴 Must
Programming Python (Anaconda) ✅ Free 🔴 Must
BI Dashboard Power BI Desktop ✅ Free 🔴 Must
BI Dashboard Tableau Public ✅ Free 🟡 High
Notebooks Jupyter / Google Colab ✅ Free 🔴 Must
Version Control GitHub ✅ Free 🟡 High
SQL Practice LeetCode / StrataScratch ✅ Free tier 🔴 Must
Cloud (Bonus) Google BigQuery ✅ Free tier 🟢 Good to have

Data Analyst Salary in India (2026)

Experience Level Average Salary
Fresher (0–1 year) ₹3.5 – ₹6 LPA
Junior Analyst (1–3 years) ₹6 – ₹10 LPA
Mid-Level (3–5 years) ₹10 – ₹16 LPA
Senior Analyst (5+ years) ₹16 – ₹30 LPA

Top Companies Hiring Data Analysts in 2026

Career Progression: Data Analyst → Senior Data Analyst → Analytics Manager → Head of Analytics → VP of Data / Chief Data Officer


Top 30 SQL Interview Questions With Answers (2026)

Asked at Amazon, Flipkart, Swiggy, Paytm, TCS, Accenture, Deloitte, and hundreds of startups.


Section A: SQL Joins (Q1–Q10)

Q1. What are the different types of JOINs in SQL?

Join Type What It Returns
INNER JOIN Only matching rows from both tables
LEFT JOIN All rows from left table + matched rows from right (NULLs for no match)
RIGHT JOIN All rows from right table + matched rows from left
FULL OUTER JOIN All rows from both tables (NULLs where no match exists)
SELF JOIN A table joined with itself
CROSS JOIN Every row of Table A × every row of Table B (Cartesian product)

Q2. Get all employees and their department names, including employees without a department.

sql
SELECT 
    e.employee_id,
    e.employee_name,
    d.department_name
FROM employees e
LEFT JOIN departments d 
    ON e.department_id = d.department_id;

Why: LEFT JOIN ensures employees without a department still appear, with NULL for department_name.


Q3. Find employees who have NO matching department (orphan records).

sql
SELECT 
    e.employee_id,
    e.employee_name
FROM employees e
LEFT JOIN departments d 
    ON e.department_id = d.department_id
WHERE d.department_id IS NULL;

Q4. Write a SELF JOIN to show each employee and their manager.

sql
SELECT 
    e.employee_name AS Employee,
    m.employee_name AS Manager
FROM employees e
LEFT JOIN employees m 
    ON e.manager_id = m.employee_id;

Why: LEFT JOIN ensures employees without a manager (like the CEO) still appear.


Q5. What is the difference between INNER JOIN and LEFT JOIN?

Example: If employees has 100 rows, but only 80 have a matching department:


Q6. Find records that exist in both tables.

sql
SELECT e.employee_id, e.employee_name
FROM employees e
INNER JOIN contractors c 
    ON e.employee_id = c.contractor_id;

Q7. Get all combinations of products and categories using CROSS JOIN.

sql
SELECT 
    p.product_name,
    c.category_name
FROM products p
CROSS JOIN categories c;

If products has 5 rows and categories has 4 rows → 5 × 4 = 20 rows. Used for generating combinations or test data.


Q8. Write a FULL OUTER JOIN to find unmatched records from both tables.

sql
SELECT 
    e.employee_name,
    d.department_name
FROM employees e
FULL OUTER JOIN departments d 
    ON e.department_id = d.department_id
WHERE e.employee_id IS NULL 
   OR d.department_id IS NULL;

⚠️ MySQL does not support FULL OUTER JOIN. Use Q9’s approach instead.


Q9. How do you simulate FULL OUTER JOIN in MySQL?

sql
SELECT e.employee_name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id
UNION
SELECT e.employee_name, d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;

Q10. Use SELF JOIN to find employees who share the same salary.

sql
SELECT 
    a.employee_name AS Employee1,
    b.employee_name AS Employee2,
    a.salary
FROM employees a
JOIN employees b 
    ON a.salary = b.salary
    AND a.employee_id <> b.employee_id;

Section B: Window Functions (Q11–Q22)

Q11. What are Window Functions in SQL?

Window Functions perform calculations across a set of related rows without collapsing the result like GROUP BY does.

Function Purpose
ROW_NUMBER() Unique sequential number per row
RANK() Tied rows get same rank; next rank has a gap
DENSE_RANK() Tied rows get same rank; no gap after
NTILE(n) Divides rows into n equal buckets
LAG(col, n) Value from n rows before current row
LEAD(col, n) Value from n rows after current row
SUM() OVER() Running or cumulative totals
AVG() OVER() Moving averages
sql
function_name() OVER (
    PARTITION BY column     -- group within (optional)
    ORDER BY column         -- sort within the window
    ROWS BETWEEN ... AND ... -- define row range (optional)
)

Q12. Rank employees by salary within each department.

sql
SELECT 
    employee_name,
    department_id,
    salary,
    RANK() OVER (
        PARTITION BY department_id 
        ORDER BY salary DESC
    ) AS salary_rank
FROM employees;

Q13. Difference between RANK(), DENSE_RANK(), and ROW_NUMBER()?

Given salaries: 90000, 90000, 85000

Function Result
ROW_NUMBER() 1, 2, 3 — always unique
RANK() 1, 1, 3 — tied rows same rank, next rank skips
DENSE_RANK() 1, 1, 2 — tied rows same rank, no gap

Interview Tip: Use DENSE_RANK for “Nth highest” queries. Use ROW_NUMBER for deduplication.


Q14. Find the top 3 highest-paid employees in each department.

sql
WITH ranked AS (
    SELECT 
        employee_name,
        department_id,
        salary,
        DENSE_RANK() OVER (
            PARTITION BY department_id 
            ORDER BY salary DESC
        ) AS rnk
    FROM employees
)
SELECT * FROM ranked WHERE rnk <= 3;

You cannot use WHERE directly on a window function result — always wrap it in a CTE first.


Q15. Calculate a running total of sales by date.

sql
SELECT 
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date) AS running_total
FROM sales;

Q16. Calculate running total of sales within each region separately.

sql
SELECT 
    region,
    sale_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY region 
        ORDER BY sale_date
    ) AS running_total
FROM sales;

Q17. Use LAG() to find month-over-month revenue change.

sql
SELECT 
    month,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
    revenue - LAG(revenue, 1) OVER (ORDER BY month) AS revenue_change
FROM monthly_sales;

Q18. Use LEAD() to show the next order date for each customer.

sql
SELECT 
    customer_id,
    order_date,
    LEAD(order_date, 1) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS next_order_date
FROM orders;

Q19. Divide employees into 4 salary quartiles using NTILE().

sql
SELECT 
    employee_name,
    salary,
    NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;

Quartile 1 = top 25% earners. Used for performance tiers and segmentation.


Q20. Calculate a 3-month moving average of sales.

sql
SELECT 
    month,
    revenue,
    AVG(revenue) OVER (
        ORDER BY month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3month
FROM monthly_sales;

Q21. Remove duplicate rows and keep only the first occurrence.

sql
WITH deduped AS (
    SELECT *,
        ROW_NUMBER() OVER (
            PARTITION BY email 
            ORDER BY created_at ASC
        ) AS rn
    FROM users
)
SELECT * FROM deduped WHERE rn = 1;

Q22. Show what percentage of total sales each product contributes.

sql
SELECT 
    product_name,
    sales_amount,
    ROUND(
        100.0 * sales_amount / SUM(sales_amount) OVER (), 2
    ) AS pct_of_total
FROM product_sales
ORDER BY pct_of_total DESC;

Section C: General SQL Questions (Q23–Q30)

Q23. Difference between WHERE and HAVING?

Clause Applied Filters
WHERE Before GROUP BY Individual rows
HAVING After GROUP BY Aggregated groups

Rule: If filtering on an aggregate (COUNT, SUM, AVG) → use HAVING. Otherwise → use WHERE.


Q24. Find the second-highest salary.

sql
WITH ranked AS (
    SELECT salary,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
    FROM employees
)
SELECT salary FROM ranked WHERE rnk = 2;

Change rnk = 2 to any number for the Nth highest salary.


Q25. What is a CTE? Write an example.

A CTE (Common Table Expression) is a temporary named result set that makes complex queries easier to read.

sql
WITH high_earners AS (
    SELECT employee_name, salary, department_id
    FROM employees
    WHERE salary > 80000
)
SELECT * FROM high_earners;

Q26. Difference between DELETE, TRUNCATE, and DROP?

Command What It Does Can Rollback? WHERE Clause?
DELETE Removes specific rows ✅ Yes ✅ Yes
TRUNCATE Removes all rows, keeps structure ❌ No ❌ No
DROP Removes entire table ❌ No ❌ No

Q27. Find departments where average salary exceeds the company average.

sql
SELECT department_id, AVG(salary) AS dept_avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > (SELECT AVG(salary) FROM employees);

Q28. Find customers who placed orders in both 2024 and 2025.

sql
-- MySQL version
SELECT DISTINCT customer_id
FROM orders
WHERE YEAR(order_date) = 2024
  AND customer_id IN (
      SELECT customer_id 
      FROM orders 
      WHERE YEAR(order_date) = 2025
  );

Q29. What are SQL Indexes?

An index speeds up data retrieval — like a book’s index pointing to the right page.

sql
CREATE INDEX idx_email ON employees(email);
EXPLAIN SELECT * FROM employees WHERE email = 'user@example.com';

Indexes help on columns used in WHERE, JOIN ON, and ORDER BY. They slightly slow down INSERT/UPDATE/DELETE.


Q30. Find employees hired in the last 30 days along with their department.

sql
SELECT 
    e.employee_name,
    e.hire_date,
    d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE e.hire_date >= CURRENT_DATE - INTERVAL 30 DAY
ORDER BY e.hire_date DESC;

7 Tips to Crack Your Data Analyst Interview

1. Master SQL First — It’s Non-Negotiable 95% of interviews test SQL. Focus on Joins, Window Functions, CTEs, and Subqueries. Practice on LeetCode, HackerRank, and StrataScratch.

2. Build 3 Real Projects — Not Just Certificates A strong portfolio beats a certification every time. Companies want proof you can analyze data and tell a story.

3. Go Deep in One Visualization Tool Pick Power BI or Tableau and master it. Building a dashboard from scratch is a major differentiator.

4. Practice Case Studies Companies ask: “Our user retention dropped 15% — how would you investigate?” Use this framework: Define → Metrics → Explore → Hypothesize → Recommend.

5. LinkedIn is Your Job Engine in 2026 Post your learning journey. Share project screenshots. Write short insights. Recruiters actively look for candidates who learn in public.

6. Apply Volume + Quality Together Send 20+ tailored applications per week. Use referrals where possible. Don’t wait to feel “ready” — apply while learning.

7. Don’t Skip Mock Interviews Practice SQL rounds, case study rounds, and behavioral rounds before the real thing. Interviews teach you what to focus on faster than anything else.


Conclusion

Becoming a Data Analyst in 2026 is 100% achievable — regardless of your background. The path is clear:

Excel → SQL → Python → Visualization → Projects → Interview Prep

The most important step? Start today.

Follow this roadmap, complete the projects, and practice every SQL question above. You will crack your Data Analyst interview in 2026.

Do the payment at the QR Code below

Submission Successful. Our Team will review it in 1 hour, and contact you.