What is Data Analytics?
Data Analytics is the process of collecting, cleaning, analyzing, and visualizing data to drive smart business decisions.
In 2026, every company — from early-stage startups to Fortune 500 giants — needs data analysts to make sense of their data. This isn’t a trend. It’s the new normal.
Right now is the best time to enter this field:
- 1.5 lakh+ open Data Analyst roles in India (LinkedIn/Naukri, 2026)
- Fresher salary: ₹4–6 LPA | Mid-level: ₹10–16 LPA | US average: $95,000/year
- No CS degree required — non-tech candidates are actively preferred
Who Should Become a Data Analyst?
You are the right fit if you are:
- From a B.Tech Commerce, Arts, Science, MBA, or BBA background
- A fresher looking for a high-paying first job
- A working professional wanting a salary jump
- Someone who loves numbers and business problems but never learned to code
Good news — you don’t need to code to get started. You just need the right roadmap.
End-to-End Data Analyst Roadmap 2026
Phase 1 — Foundations
| Topic | What to Learn | Why It Matters |
|---|---|---|
| Excel / Google Sheets | Formulas, Pivot Tables, VLOOKUP, Charts | Every analyst uses this daily |
| Statistics Basics | Mean, Median, Std Dev, Probability | Core for interpreting data |
| Data Types | Structured vs Unstructured, Quantitative vs Qualitative | Know what data you’re working with |
| Business Metrics | KPIs, Revenue, Churn, Conversion Rate, DAU/MAU | Speak the language of business |
Phase 2 — SQL: The #1 Skill Tested in Interviews
SQL is asked in 95% of Data Analyst interviews. No SQL = No job.
| Topic | What to Learn |
|---|---|
| SQL Basics | SELECT, WHERE, ORDER BY, LIMIT, DISTINCT |
| Aggregations | GROUP BY, HAVING, COUNT, SUM, AVG, MIN, MAX |
| Joins | INNER, LEFT, RIGHT, FULL OUTER, SELF, CROSS |
| Subqueries | Nested, Correlated, EXISTS |
| Window Functions | ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, NTILE |
| CTEs | WITH clause for readable, clean queries |
| Performance | EXPLAIN, Indexing basics |
Phase 3 — Python for Data Analysis
| Library | What to Learn | Use Case |
|---|---|---|
| NumPy | Arrays, numerical operations | Fast math on large datasets |
| Pandas | DataFrames, merge, groupby, pivot | 80% of your daily Python work |
| Matplotlib | Line, bar, pie charts | Quick visualization |
| Seaborn | Heatmaps, distribution plots | Statistical visualization |
| Jupyter Notebook | Writing + documenting code | Industry-standard tool |
Must-know Pandas operations: .fillna(), .dropna(), .merge(), .groupby(), .agg(), .loc[], .pivot_table()
Phase 4 — Data Visualization
| Tool | Best For | Why Learn It |
|---|---|---|
| Power BI | Business dashboards | Most-used in Indian companies |
| Tableau | Interactive visual analytics | Most-used in MNCs and product companies |
| Looker Studio | Free cloud reporting | Great for freelancers and digital marketers |
Build at least 3 dashboards — Sales, HR, and one personal project.
Phase 5 — Real Projects + Portfolio
Project 1: Sales Performance Dashboard Excel + Power BI | Dataset: Superstore Sales (Kaggle)
Project 2: E-Commerce Customer Analysis SQL + Python + Tableau | Dataset: Brazilian E-Commerce (Kaggle)
Project 3: HR Attrition Analysis Python (Pandas + Seaborn) + Power BI | Dataset: IBM HR Analytics (Kaggle)
Upload all projects to GitHub with a clear README. Post findings on LinkedIn with screenshots — recruiters notice this.
Phase 6 — Interview Preparation
- Practice the 30 SQL questions below
- Learn A/B Testing basics
- Master the case study framework: Define → Metrics → Explore → Hypothesize → Recommend
- Write a resume with quantified outcomes (not just job duties)
20-Week Study Plan
| Week | Topic | Daily Hours | Goal |
|---|---|---|---|
| 1 | Excel Basics | 2 hrs | Formulas, pivot tables |
| 2 | Excel Advanced + Statistics | 2 hrs | VLOOKUP, stats fundamentals |
| 3 | SQL Basics | 2–3 hrs | SELECT, WHERE, GROUP BY |
| 4 | SQL Intermediate | 2–3 hrs | Subqueries, aggregations |
| 5 | SQL Joins | 2–3 hrs | All 6 join types with practice |
| 6 | SQL Window Functions | 2–3 hrs | RANK, LAG, LEAD, running totals |
| 7 | SQL Practice + CTEs | 3 hrs | 30+ LeetCode/HackerRank problems |
| 8 | Python NumPy + Pandas | 2–3 hrs | Read, filter, clean data |
| 9 | Python Pandas Advanced | 2–3 hrs | Merge, groupby, EDA workflow |
| 10 | Python Visualization | 2 hrs | Matplotlib + Seaborn |
| 11 | Power BI Basics | 2 hrs | First report, data connections |
| 12 | Power BI + DAX | 2–3 hrs | Measures, calculated columns |
| 13 | Tableau Basics | 2 hrs | Charts, filters, dashboards |
| 14 | Project 1: Sales Dashboard | 3 hrs | Full end-to-end project |
| 15 | Project 2: E-Commerce Analysis | 3 hrs | SQL + Python + Tableau |
| 16 | Project 3: HR Attrition | 3 hrs | Python + Power BI |
| 17 | Portfolio + GitHub + LinkedIn | 2 hrs | Upload projects, write case studies |
| 18 | Resume + Job Applications | 2 hrs | Apply to 20+ jobs/week |
| 19 | SQL Interview Prep | 3 hrs | Practice all 30 questions |
| 20 | Mock Interviews + Case Studies | 3 hrs | Full round simulations |
Tools Every Data Analyst Must Know in 2026
| Category | Tool | Free? | Priority |
|---|---|---|---|
| Spreadsheets | Excel / Google Sheets | ✅ Free | 🔴 Must |
| Database | MySQL / PostgreSQL | ✅ Free | 🔴 Must |
| Programming | Python (Anaconda) | ✅ Free | 🔴 Must |
| BI Dashboard | Power BI Desktop | ✅ Free | 🔴 Must |
| BI Dashboard | Tableau Public | ✅ Free | 🟡 High |
| Notebooks | Jupyter / Google Colab | ✅ Free | 🔴 Must |
| Version Control | GitHub | ✅ Free | 🟡 High |
| SQL Practice | LeetCode / StrataScratch | ✅ Free tier | 🔴 Must |
| Cloud (Bonus) | Google BigQuery | ✅ Free tier | 🟢 Good to have |
Data Analyst Salary in India (2026)
| Experience Level | Average Salary |
|---|---|
| Fresher (0–1 year) | ₹3.5 – ₹6 LPA |
| Junior Analyst (1–3 years) | ₹6 – ₹10 LPA |
| Mid-Level (3–5 years) | ₹10 – ₹16 LPA |
| Senior Analyst (5+ years) | ₹16 – ₹30 LPA |
Top Companies Hiring Data Analysts in 2026
- IT Giants: TCS, Infosys, Wipro, HCL, Accenture, Capgemini
- Product Companies: Flipkart, Amazon, Swiggy, Zomato, PhonePe, Meesho, Razorpay
- MNCs: Google, Microsoft, Deloitte, EY, PwC, McKinsey
- Startups: 1,000+ funded startups across India actively hiring
Career Progression: Data Analyst → Senior Data Analyst → Analytics Manager → Head of Analytics → VP of Data / Chief Data Officer
Top 30 SQL Interview Questions With Answers (2026)
Asked at Amazon, Flipkart, Swiggy, Paytm, TCS, Accenture, Deloitte, and hundreds of startups.
Section A: SQL Joins (Q1–Q10)
Q1. What are the different types of JOINs in SQL?
| Join Type | What It Returns |
|---|---|
| INNER JOIN | Only matching rows from both tables |
| LEFT JOIN | All rows from left table + matched rows from right (NULLs for no match) |
| RIGHT JOIN | All rows from right table + matched rows from left |
| FULL OUTER JOIN | All rows from both tables (NULLs where no match exists) |
| SELF JOIN | A table joined with itself |
| CROSS JOIN | Every row of Table A × every row of Table B (Cartesian product) |
Q2. Get all employees and their department names, including employees without a department.
SELECT
e.employee_id,
e.employee_name,
d.department_name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.department_id;
Why: LEFT JOIN ensures employees without a department still appear, with NULL for department_name.
Q3. Find employees who have NO matching department (orphan records).
SELECT
e.employee_id,
e.employee_name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id IS NULL;
Q4. Write a SELF JOIN to show each employee and their manager.
SELECT
e.employee_name AS Employee,
m.employee_name AS Manager
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.employee_id;
Why: LEFT JOIN ensures employees without a manager (like the CEO) still appear.
Q5. What is the difference between INNER JOIN and LEFT JOIN?
- INNER JOIN → Returns only rows where a match exists in both tables.
- LEFT JOIN → Returns all rows from the left table + matched rows from the right.
Example: If employees has 100 rows, but only 80 have a matching department:
- INNER JOIN → 80 rows
- LEFT JOIN → 100 rows (20 rows show NULL for department columns)
Q6. Find records that exist in both tables.
SELECT e.employee_id, e.employee_name
FROM employees e
INNER JOIN contractors c
ON e.employee_id = c.contractor_id;
Q7. Get all combinations of products and categories using CROSS JOIN.
SELECT
p.product_name,
c.category_name
FROM products p
CROSS JOIN categories c;
If products has 5 rows and categories has 4 rows → 5 × 4 = 20 rows. Used for generating combinations or test data.
Q8. Write a FULL OUTER JOIN to find unmatched records from both tables.
SELECT
e.employee_name,
d.department_name
FROM employees e
FULL OUTER JOIN departments d
ON e.department_id = d.department_id
WHERE e.employee_id IS NULL
OR d.department_id IS NULL;
⚠️ MySQL does not support FULL OUTER JOIN. Use Q9’s approach instead.
Q9. How do you simulate FULL OUTER JOIN in MySQL?
SELECT e.employee_name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id
UNION
SELECT e.employee_name, d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;
Q10. Use SELF JOIN to find employees who share the same salary.
SELECT
a.employee_name AS Employee1,
b.employee_name AS Employee2,
a.salary
FROM employees a
JOIN employees b
ON a.salary = b.salary
AND a.employee_id <> b.employee_id;
Section B: Window Functions (Q11–Q22)
Q11. What are Window Functions in SQL?
Window Functions perform calculations across a set of related rows without collapsing the result like GROUP BY does.
| Function | Purpose |
|---|---|
| ROW_NUMBER() | Unique sequential number per row |
| RANK() | Tied rows get same rank; next rank has a gap |
| DENSE_RANK() | Tied rows get same rank; no gap after |
| NTILE(n) | Divides rows into n equal buckets |
| LAG(col, n) | Value from n rows before current row |
| LEAD(col, n) | Value from n rows after current row |
| SUM() OVER() | Running or cumulative totals |
| AVG() OVER() | Moving averages |
function_name() OVER (
PARTITION BY column -- group within (optional)
ORDER BY column -- sort within the window
ROWS BETWEEN ... AND ... -- define row range (optional)
)
Q12. Rank employees by salary within each department.
SELECT
employee_name,
department_id,
salary,
RANK() OVER (
PARTITION BY department_id
ORDER BY salary DESC
) AS salary_rank
FROM employees;
Q13. Difference between RANK(), DENSE_RANK(), and ROW_NUMBER()?
Given salaries: 90000, 90000, 85000
| Function | Result |
|---|---|
| ROW_NUMBER() | 1, 2, 3 — always unique |
| RANK() | 1, 1, 3 — tied rows same rank, next rank skips |
| DENSE_RANK() | 1, 1, 2 — tied rows same rank, no gap |
Interview Tip: Use DENSE_RANK for “Nth highest” queries. Use ROW_NUMBER for deduplication.
Q14. Find the top 3 highest-paid employees in each department.
WITH ranked AS (
SELECT
employee_name,
department_id,
salary,
DENSE_RANK() OVER (
PARTITION BY department_id
ORDER BY salary DESC
) AS rnk
FROM employees
)
SELECT * FROM ranked WHERE rnk <= 3;
You cannot use WHERE directly on a window function result — always wrap it in a CTE first.
Q15. Calculate a running total of sales by date.
SELECT
sale_date,
amount,
SUM(amount) OVER (ORDER BY sale_date) AS running_total
FROM sales;
Q16. Calculate running total of sales within each region separately.
SELECT
region,
sale_date,
amount,
SUM(amount) OVER (
PARTITION BY region
ORDER BY sale_date
) AS running_total
FROM sales;
Q17. Use LAG() to find month-over-month revenue change.
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS revenue_change
FROM monthly_sales;
Q18. Use LEAD() to show the next order date for each customer.
SELECT
customer_id,
order_date,
LEAD(order_date, 1) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS next_order_date
FROM orders;
Q19. Divide employees into 4 salary quartiles using NTILE().
SELECT
employee_name,
salary,
NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;
Quartile 1 = top 25% earners. Used for performance tiers and segmentation.
Q20. Calculate a 3-month moving average of sales.
SELECT
month,
revenue,
AVG(revenue) OVER (
ORDER BY month
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3month
FROM monthly_sales;
Q21. Remove duplicate rows and keep only the first occurrence.
WITH deduped AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY email
ORDER BY created_at ASC
) AS rn
FROM users
)
SELECT * FROM deduped WHERE rn = 1;
Q22. Show what percentage of total sales each product contributes.
SELECT
product_name,
sales_amount,
ROUND(
100.0 * sales_amount / SUM(sales_amount) OVER (), 2
) AS pct_of_total
FROM product_sales
ORDER BY pct_of_total DESC;
Section C: General SQL Questions (Q23–Q30)
Q23. Difference between WHERE and HAVING?
| Clause | Applied | Filters |
|---|---|---|
| WHERE | Before GROUP BY | Individual rows |
| HAVING | After GROUP BY | Aggregated groups |
Rule: If filtering on an aggregate (COUNT, SUM, AVG) → use HAVING. Otherwise → use WHERE.
Q24. Find the second-highest salary.
WITH ranked AS (
SELECT salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
)
SELECT salary FROM ranked WHERE rnk = 2;
Change
rnk = 2to any number for the Nth highest salary.
Q25. What is a CTE? Write an example.
A CTE (Common Table Expression) is a temporary named result set that makes complex queries easier to read.
WITH high_earners AS (
SELECT employee_name, salary, department_id
FROM employees
WHERE salary > 80000
)
SELECT * FROM high_earners;
Q26. Difference between DELETE, TRUNCATE, and DROP?
| Command | What It Does | Can Rollback? | WHERE Clause? |
|---|---|---|---|
| DELETE | Removes specific rows | ✅ Yes | ✅ Yes |
| TRUNCATE | Removes all rows, keeps structure | ❌ No | ❌ No |
| DROP | Removes entire table | ❌ No | ❌ No |
Q27. Find departments where average salary exceeds the company average.
SELECT department_id, AVG(salary) AS dept_avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > (SELECT AVG(salary) FROM employees);
Q28. Find customers who placed orders in both 2024 and 2025.
-- MySQL version
SELECT DISTINCT customer_id
FROM orders
WHERE YEAR(order_date) = 2024
AND customer_id IN (
SELECT customer_id
FROM orders
WHERE YEAR(order_date) = 2025
);
Q29. What are SQL Indexes?
An index speeds up data retrieval — like a book’s index pointing to the right page.
CREATE INDEX idx_email ON employees(email);
EXPLAIN SELECT * FROM employees WHERE email = 'user@example.com';
Indexes help on columns used in WHERE, JOIN ON, and ORDER BY. They slightly slow down INSERT/UPDATE/DELETE.
Q30. Find employees hired in the last 30 days along with their department.
SELECT
e.employee_name,
e.hire_date,
d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE e.hire_date >= CURRENT_DATE - INTERVAL 30 DAY
ORDER BY e.hire_date DESC;
7 Tips to Crack Your Data Analyst Interview
1. Master SQL First — It’s Non-Negotiable 95% of interviews test SQL. Focus on Joins, Window Functions, CTEs, and Subqueries. Practice on LeetCode, HackerRank, and StrataScratch.
2. Build 3 Real Projects — Not Just Certificates A strong portfolio beats a certification every time. Companies want proof you can analyze data and tell a story.
3. Go Deep in One Visualization Tool Pick Power BI or Tableau and master it. Building a dashboard from scratch is a major differentiator.
4. Practice Case Studies Companies ask: “Our user retention dropped 15% — how would you investigate?” Use this framework: Define → Metrics → Explore → Hypothesize → Recommend.
5. LinkedIn is Your Job Engine in 2026 Post your learning journey. Share project screenshots. Write short insights. Recruiters actively look for candidates who learn in public.
6. Apply Volume + Quality Together Send 20+ tailored applications per week. Use referrals where possible. Don’t wait to feel “ready” — apply while learning.
7. Don’t Skip Mock Interviews Practice SQL rounds, case study rounds, and behavioral rounds before the real thing. Interviews teach you what to focus on faster than anything else.
Conclusion
Becoming a Data Analyst in 2026 is 100% achievable — regardless of your background. The path is clear:
Excel → SQL → Python → Visualization → Projects → Interview Prep
The most important step? Start today.
Follow this roadmap, complete the projects, and practice every SQL question above. You will crack your Data Analyst interview in 2026.