BDAA
Java - JDK vs JRE
We need to know about Java first. Java has two components, as follows: -
JRE (Java Runtime Environment) for running Java applications, and JDK (Java Development Kit) for developing them. The JDK includes the JRE, the compiler, and other development tools.
So, JDK = JRE + development tools, so always install the JDK.
How to Check which Java is installed and where?
We can check if JDK is installed by running this command in the command prompt (CMD or terminal):
for JRE version
java -version
for JDK version
javac -version
To run these commands, press the Windows key + R, type cmd, and press Enter.
A black window (the Command Prompt) opens; type the first command above and press Enter, then the second command and press Enter. If you see a response to both, then the JRE as well as the JDK are installed.
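As a small aside, the version strings these commands print can also be checked programmatically. The sketch below is a hypothetical helper (parse_java_version is not a standard function) that pulls the version number out of typical java -version / javac -version output:

```python
import re

def parse_java_version(version_output):
    """Extract (major, minor, patch) from typical `java -version` or
    `javac -version` output, e.g. 'java version "11.0.2"' or 'javac 22.0.2'.
    Returns None if no version number is found."""
    match = re.search(r'(\d+)\.(\d+)\.(\d+)', version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())

print(parse_java_version('java version "11.0.2" 2019-01-15 LTS'))  # (11, 0, 2)
print(parse_java_version('javac 22.0.2'))                          # (22, 0, 2)
```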
In case you want to know where Java is installed (e.g., on the C: drive or the D: drive), there are three methods.
Method 1: Using Command Prompt
Type this command in CMD and press Enter:
where java
or
where javac
This will show the exact path where Java (JDK) is installed.
Method 2: Using Environment Variables
In CMD, type:
echo %JAVA_HOME%
If JAVA_HOME is set, it will display the installation path.
Method 3: Manually Check in C Drive
Open File Explorer (Win + E).
Go to C:\Program Files\Java (for 64-bit Java).
Or check C:\Program Files (x86)\Java (for 32-bit Java).
Inside, you’ll find a folder like jdk-22.0.2, which is your JDK installation directory.
1. java.exe (Coffee Cup Icon) ☕
This is the Java Runtime Environment (JRE) executable.
It runs Java applications.
The coffee cup icon represents Java's branding.
2. javac.exe (Blue Rectangle with Lines) 📄
This is the Java Compiler (javac stands for "Java Compiler").
It compiles .java files into .class files (bytecode).
The blue rectangle with lines is the default Windows icon for executables that don’t have a custom icon.
Why No Coffee Cup Icon on javac.exe?
Java assigns the coffee cup icon only to java.exe, since it is used to run Java programs.
javac.exe is just a command-line tool for compiling Java code, so it does not have a custom icon.
Summary:
java.exe ☕ → Runs Java programs.
javac.exe 📄 → Compiles Java code (but no coffee cup icon).
Set the JAVA_HOME Environment Variable
Open Environment Variables:
Press Win + R, type sysdm.cpl, and hit Enter.
Go to the Advanced tab and click Environment Variables.
Set JAVA_HOME:
Under System Variables, click New (or Edit if JAVA_HOME exists).
Enter:
Variable Name: JAVA_HOME
Variable Value: The path to your correct Java installation (e.g., C:\Java or C:\Program Files\Java\jdk-22.0.2). Avoid pointing JAVA_HOME at C:\Program Files\Common Files\Oracle\Java\javapath, which only contains launcher shortcuts, not a full JDK.
Update the PATH Variable:
In the System Variables section, find Path and click Edit.
Click New and add %JAVA_HOME%\bin.
Verify the Configuration
Close and reopen the command prompt, then run:
echo %JAVA_HOME%
java -version
Ensure it displays the correct Java path and version.
Question - What are system variables in simple terms?
Ans. System variables are global environment variables in Windows that define system-wide settings, such as paths to important software like Java, Python, and Hadoop. They help programs locate required executables and libraries.
In other words, system variables are like a toolbox label for a plumber: they tell the system where to find the right tools (software) so it can do its job properly.
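These variables can also be read from Python via os.environ, which is an easy way to see what the "toolbox labels" currently say. A minimal sketch (DEMO_HOME is a made-up variable so the example does not depend on what is installed on your machine):

```python
import os

# Read an environment variable; .get() returns None if it is not set.
java_home = os.environ.get("JAVA_HOME")
print("JAVA_HOME:", java_home)

# Setting a variable this way affects only the current process,
# just like `set NAME=value` in a single Command Prompt session.
os.environ["DEMO_HOME"] = r"C:\Java"
print(os.environ["DEMO_HOME"])  # C:\Java
```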
1. Download and Install Java 11
Download Java 11 (LTS) from:
Install it in a separate folder, e.g.,
C:\Program Files\Java\jdk-11
2. Set Up Environment Variables for Java 11
Open Environment Variables:
Press Win + R, type sysdm.cpl, and hit Enter.
Go to the Advanced tab → Click Environment Variables.
Modify JAVA_HOME for Java 11:
Find JAVA_HOME in System Variables and edit it.
Set the value to Java 11's installation path:
C:\Program Files\Java\jdk-11
Modify Path Variable:
In System Variables, find Path and click Edit.
Add this new entry at the top:
%JAVA_HOME%\bin
This ensures that the system picks Java 11 first when needed.
3. Switch Between Java Versions Manually
Whenever you want to switch between Java 11 and Java 22, use these commands:
Switch to Java 11 (for Hadoop)
set JAVA_HOME=C:\Program Files\Java\jdk-11
set PATH=%JAVA_HOME%\bin;%PATH%
Switch to Java 22 (if needed)
set JAVA_HOME=C:\java
set PATH=%JAVA_HOME%\bin;%PATH%
💡 These settings apply only to the current Command Prompt session.
For a permanent switch, update JAVA_HOME in Environment Variables.
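The reason the %JAVA_HOME%\bin entry should come first is that Windows searches PATH directories left to right and runs the first java.exe it finds. The toy model below illustrates this lookup order; the directory listing is a made-up dict standing in for the real file system:

```python
# Toy model of PATH resolution: search directories left to right,
# return the first one that contains the executable.
def resolve_executable(path_entries, name, dir_contents):
    for directory in path_entries:
        if name in dir_contents.get(directory, []):
            return directory + "\\" + name
    return None

dir_contents = {
    r"C:\Program Files\Java\jdk-11\bin": ["java.exe", "javac.exe"],
    r"C:\java\bin": ["java.exe", "javac.exe"],
}

# With the Java 11 bin directory first, Java 11 wins:
path = [r"C:\Program Files\Java\jdk-11\bin", r"C:\java\bin"]
print(resolve_executable(path, "java.exe", dir_contents))
# C:\Program Files\Java\jdk-11\bin\java.exe
```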
4. Verify the Setup
After switching, check which Java version is active:
java -version
echo %JAVA_HOME%
It should now show Java 11 when set for Hadoop.
Hadoop Installation
binary-aarch64: Hadoop compiled for ARM-based processors (e.g., Apple M1/M2, Raspberry Pi, some servers).
binary-lean: A lighter version of Hadoop with fewer dependencies, useful for minimal installations.
binary (default): The standard, full-featured Hadoop binary for most systems (use this one for Windows).
checksum/signature: Files used to verify integrity and ensure the download isn’t corrupted or tampered with.
What to do?
✅ Download the "binary" version (without -aarch64 or -lean) for your Windows setup.
Next Step:
Once the download is complete, extract the ZIP/TAR file to:
C:\Hadoop\Hadoop
Set HADOOP_HOME
Open System Properties (Win + R, type sysdm.cpl, and press Enter).
Go to the Advanced tab → Click Environment Variables.
Under System Variables, click New:
Variable Name: HADOOP_HOME
Variable Value: C:\Hadoop\Hadoop
Click OK to save.
Awesome! ✅ Now, let’s add Hadoop’s bin directory to the Path variable.
Step 2: Add Hadoop to System Path
Open Environment Variables again (Win + R → type sysdm.cpl → Enter).
Under System Variables, find and select Path, then click Edit.
Click New and add:
C:\Hadoop\Hadoop\bin
Click OK to save and close all windows.
Great! ✅ Now, let's verify the Hadoop setup.
Step 3: Verify Hadoop Installation
Open Command Prompt (cmd).
Run the following command to check if Hadoop is recognized:
hadoop version
If everything is set up correctly, it should display the Hadoop version (e.g., Hadoop 3.4.1).
The response from the Command Prompt is as follows: -
Hadoop 3.4.1
Source code repository https://github.com/apache/hadoop.git -r 4d7825309348956336b8f06a08322b78422849b1
Compiled by mthakur on 2024-10-09T14:57Z
Compiled on platform linux-x86_64
Compiled with protoc 3.23.4
From source with checksum 7292fe9dba5e2e44e3a9f763fce3e680
This command was run using /C:/Hadoop/Hadoop/share/hadoop/common/hadoop-common-3.4.1.jar
Next Step: Configure Hadoop
Now, we need to set up Hadoop for standalone mode and configure essential files.
Step 4: Configure HADOOP_CONF_DIR
Open Environment Variables (sysdm.cpl).
Under System Variables, click New and add:
Variable Name: HADOOP_CONF_DIR
Variable Value: C:\Hadoop\Hadoop\etc\hadoop
Click OK and close all windows.
Step 5: Edit core-site.xml
Navigate to:
C:\Hadoop\Hadoop\etc\hadoop
Open the file core-site.xml in Notepad or any text editor.
Replace the content with the following configuration:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
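To sanity-check a configuration file like this, the property name/value pairs can be read back with Python's standard xml.etree module. This sketch parses the snippet inline for illustration; in practice you would point ET.parse at the actual file under C:\Hadoop\Hadoop\etc\hadoop:

```python
import xml.etree.ElementTree as ET

# The core-site.xml content from above, inlined as a string.
core_site = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""

root = ET.fromstring(core_site)
props = {p.findtext("name"): p.findtext("value") for p in root.findall("property")}
print(props["fs.defaultFS"])  # hdfs://localhost:9000
```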
Step 6: Edit hdfs-site.xml
Navigate to:
C:\Hadoop\Hadoop\etc\hadoop
Open hdfs-site.xml in Notepad or any text editor.
Replace the content with the following configuration:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>C:\Hadoop\data\namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>C:\Hadoop\data\datanode</value>
  </property>
</configuration>
Create these directories by running mkdir C:\Hadoop\data\namenode and mkdir C:\Hadoop\data\datanode. If they already exist, the response from the commands will be as follows: -
C:\Users\DEV>mkdir C:\Hadoop\data\namenode
A subdirectory or file C:\Hadoop\data\namenode already exists.
C:\Users\DEV>mkdir C:\Hadoop\data\datanode
A subdirectory or file C:\Hadoop\data\datanode already exists.
Step 7: Edit mapred-site.xml
Navigate to:
C:\Hadoop\Hadoop\etc\hadoop
Open mapred-site.xml in Notepad or any text editor.
Make sure it contains this configuration:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Save and close the file.
Step 8: Edit yarn-site.xml
Navigate to:
C:\Hadoop\Hadoop\etc\hadoop
Open yarn-site.xml in Notepad or any text editor.
Replace its content with:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Save and close the file.
Basic commands?
What is a DataFrame?
import pandas as pd
# Create a DataFrame (your digital recipe book)
df = pd.DataFrame({
    "Dish": ["Pizza", "Burger", "Pasta"],
    "Ingredients": ["Cheese, Dough", "Patty, Bun", "Noodles, Sauce"],
    "Time to Cook (mins)": [20, 15, 25],
    "Calories": [500, 400, 450]
})
# View the DataFrame
print(df)
What is a library? Pandas & matplotlib?
What is a Library?
A library in Python is like a set of specialized kitchen tools for a chef.
Imagine you’re a chef, and you have a regular kitchen with basic tools like knives and pans (Python’s built-in tools).
But what if you want to make fancy dishes, like sushi or soufflé? You’d need special tools like a rice cooker, pastry blender, or sous-vide machine.
In Python, libraries are those specialized tools that help you do specific tasks:
Pandas: Your tool for organizing and managing recipes (data).
Matplotlib: Your tool for presenting your dishes beautifully (visualizations).
What is Pandas (to a Chef)?
Pandas is like your recipe organizer.
It’s your kitchen assistant who helps you keep all your recipes (data) in a neat and tidy table (DataFrame).
You can:
Add recipes (rows of data).
Edit ingredients (columns of data).
Find and filter recipes (querying data).
Sort recipes (organize data by time or calories).
Example for a Chef
Let’s say you’re working with a digital recipe book:
import pandas as pd

# Create a DataFrame (recipe book)
recipes = pd.DataFrame({
    "Dish": ["Pizza", "Burger", "Pasta"],
    "Calories": [500, 400, 450],
    "Cooking Time (mins)": [20, 15, 25]
})

# View your recipes
print(recipes)
Output:
     Dish  Calories  Cooking Time (mins)
0   Pizza       500                   20
1  Burger       400                   15
2   Pasta       450                   25
With Pandas, you can:
Sort by calories:
recipes.sort_values(by="Calories")
Find dishes with less than 20 minutes cooking time:
recipes[recipes["Cooking Time (mins)"] < 20]
Pandas = your digital sous-chef!
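Putting the two one-liners above into a complete, runnable sketch (same recipe data as before):

```python
import pandas as pd

recipes = pd.DataFrame({
    "Dish": ["Pizza", "Burger", "Pasta"],
    "Calories": [500, 400, 450],
    "Cooking Time (mins)": [20, 15, 25]
})

# Sort by calories, lowest first
by_calories = recipes.sort_values(by="Calories")
print(by_calories["Dish"].tolist())  # ['Burger', 'Pasta', 'Pizza']

# Dishes that cook in under 20 minutes
quick = recipes[recipes["Cooking Time (mins)"] < 20]
print(quick["Dish"].tolist())  # ['Burger']
```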
What is Matplotlib (to a Chef)?
Matplotlib is like your presentation toolkit.
After you cook your dishes, you want to present them beautifully.
Matplotlib is your tool for:
Making beautiful food presentations (bar charts, pie charts, line charts).
Highlighting which dishes are the most popular or healthiest.
Example for a Chef
Let’s visualize the calories of your dishes:
import matplotlib.pyplot as plt

# Dishes and their calories
dishes = ["Pizza", "Burger", "Pasta"]
calories = [500, 400, 450]

# Create a bar chart
plt.bar(dishes, calories, color="lightgreen", edgecolor="black")
plt.title("Calories in Each Dish")
plt.xlabel("Dish")
plt.ylabel("Calories")
plt.show()
Output:
A bar chart showing the calories of each dish.
Matplotlib = your plating and presentation artist!
Summary
Library: Your set of specialized tools (like kitchen gadgets).
Pandas: Your assistant for managing data (like a digital recipe book).
Matplotlib: Your presentation expert for showcasing data beautifully (like plating dishes).
Reasons for BDAA for Admin Officers?
The burden of Big Data analysis is poised to fall heavily on administrative officers due to the increasing adoption of data-driven decision-making in governance. Training them now can preemptively equip them with the necessary skills to effectively leverage data for public administration. Here are the reasons, examples, and solutions:
Reasons for Increased Burden
Data-Driven Policy Making:
Governments are adopting evidence-based policies to address complex issues like poverty, health, and urban planning.
Example: The Indian government’s Aspirational Districts Programme uses data to track development metrics in backward districts.
Implication: Administrative officers will need to interpret data to ensure accurate tracking and intervention.
Emergence of Digital Infrastructure:
Initiatives like Digital India and Smart Cities Mission generate vast amounts of data through IoT, sensors, and citizen feedback systems.
Example: Smart Cities dashboards provide real-time data on traffic, water usage, and waste management.
Implication: Officers will need to extract actionable insights from these dashboards.
Increased Accountability and Transparency:
Right to Information (RTI) Act demands data transparency, increasing the need for organized data analysis.
Example: Officers managing e-Governance portals must analyze citizen grievance trends and ensure quick redressal.
Implication: Without proper training, officers might struggle with compliance and responsiveness.
Integration of Emerging Technologies:
Technologies like AI, ML, and Big Data analytics are being integrated into government systems for predictive modeling.
Example: AI is used in crop yield predictions to support farmers.
Implication: Officers must understand the basics of these technologies to collaborate effectively with technical teams.
Rapid Expansion of Citizen Databases:
The use of massive databases like Aadhaar, PM Kisan, and Ujjwala Yojana requires in-depth understanding of data handling.
Implication: Officers will increasingly manage, analyze, and safeguard sensitive citizen data.
Examples of Big Data in Governance
Health:
Example: During the COVID-19 pandemic, Big Data helped analyze infection trends, vaccination progress, and resource allocation.
Challenge: Many officers were unprepared to handle the scale and complexity of data during emergencies.
Education:
Example: Data from Diksha Portal and UDISE+ is used to monitor student outcomes and teacher performance.
Challenge: Lack of trained officers delays effective interventions.
Environment:
Example: Pollution monitoring systems in Delhi use data from sensors for air quality management.
Challenge: Interpreting this data to enforce actionable measures requires analytical skills.
Solutions
Comprehensive Training Programs:
Organize workshops on Big Data fundamentals:
Data cleaning and visualization.
Tools like Tableau, Power BI, or Python basics.
Example: ISTM or state administrative institutes can lead these efforts.
Hands-On Practice:
Provide real-world datasets (e.g., rainfall, health indicators) for officers to analyze in training.
Example: Create simulated governance scenarios for decision-making.
Collaborations with Technical Experts:
Foster partnerships with institutions like IITs, NITs, and NIC to upskill officers in data analytics.
Example: Data analytics courses tailored for administrative officers.
Creation of Dedicated Data Teams:
Establish Big Data cells in each department with trained officers supported by IT professionals.
Example: Data-driven cells in Smart City projects.
Policy-Level Integration:
Mandate Big Data proficiency as a required skill in promotion exams for officers.
Example: Incorporate data-related case studies in administrative training.
Incentivizing Continuous Learning:
Offer certifications or rewards for officers excelling in Big Data analysis.
Example: Certification programs on Coursera, or government-recognized courses.
Benefits of Training Now
Proactive Problem Solving: Officers can anticipate trends and take timely actions.
Efficient Resource Allocation: Data insights will improve governance efficiency.
Enhanced Career Progression: Trained officers will be better equipped for senior roles.
Public Trust: Data-driven governance fosters transparency and trust.
Conclusion
Big Data is reshaping governance. By training administrative officers now, the government can ensure smoother transitions into data-driven frameworks, more efficient public service delivery, and better policy outcomes. Investing in their capabilities will have a long-lasting impact on governance quality and citizen satisfaction.
5V of Big Data
Tools of Big Data Analysis?
What is Big Data & Relevancy in Governance?
Definition: Big Data refers to large, complex datasets that cannot be processed using traditional methods.
Characteristics (5 Vs):
Volume: Massive amounts of data (e.g., citizen records, economic data).
Velocity: Data generated at high speed (e.g., real-time traffic data).
Variety: Different types (structured: spreadsheets; unstructured: emails, images).
Veracity: Reliability of data (e.g., social media rumors vs. authentic data).
Value: Insights derived (e.g., better policymaking).
Example:
"Imagine analyzing millions of Aadhaar transactions to find patterns in healthcare access."
Big Data in Governance?
Monitoring welfare schemes (e.g., MNREGA fund usage).
Analyzing traffic data for urban planning.
Detecting fraud in subsidies using data patterns.
Using weather data to prepare for disasters.
Exercise:
Data: Provide a mock dataset of 1,000 citizen transactions with fields like Transaction ID, Amount, Date, Region, and Scheme.
Task: Ask participants to identify anomalies (e.g., unusually high claims in a specific region).
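The exercise above can be sketched in pandas. The data here is a tiny made-up stand-in for the 1,000-row mock dataset, and the "three times the median" rule is just one simple way to flag unusually high claims:

```python
import pandas as pd

# Tiny stand-in for the mock citizen-transaction dataset.
transactions = pd.DataFrame({
    "Transaction ID": [1, 2, 3, 4, 5, 6],
    "Amount": [1200, 1100, 1300, 9800, 1250, 1150],
    "Region": ["North", "South", "East", "East", "West", "North"],
    "Scheme": ["MNREGA"] * 6
})

# Flag amounts more than three times the overall median as anomalies.
threshold = 3 * transactions["Amount"].median()
anomalies = transactions[transactions["Amount"] > threshold]
print(anomalies[["Transaction ID", "Region", "Amount"]])
# Transaction 4 (East, 9800) stands out
```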
Tools for Big Data Analysis?
Microsoft Excel (basic), Tableau, Hadoop, Spark.
Region, Scheme, Beneficiaries, Funds Utilized (in Lakhs)
North, MNREGA, 5000, 300
South, MNREGA, 4000, 250
East, MNREGA, 4500, 275
West, MNREGA, 4800, 290
Task:
Create a pivot table to summarize total beneficiaries and funds by region.
Identify which region utilized funds most efficiently.
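The same pivot-table task can be done in pandas instead of Excel. A sketch using the table above ("Beneficiaries per Lakh" is a derived efficiency measure, not a column in the original data):

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["North", "South", "East", "West"],
    "Scheme": ["MNREGA"] * 4,
    "Beneficiaries": [5000, 4000, 4500, 4800],
    "Funds Utilized (in Lakhs)": [300, 250, 275, 290]
})

# Pivot table: total beneficiaries and funds by region.
summary = df.pivot_table(index="Region",
                         values=["Beneficiaries", "Funds Utilized (in Lakhs)"],
                         aggfunc="sum")

# Efficiency: beneficiaries served per lakh of funds (higher is better).
summary["Beneficiaries per Lakh"] = (
    summary["Beneficiaries"] / summary["Funds Utilized (in Lakhs)"]
)
print(summary.sort_values("Beneficiaries per Lakh", ascending=False))
# North comes out most efficient (about 16.7 beneficiaries per lakh)
```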
Big Data success in India?
COVID-19 Management: Explain how the Aarogya Setu app used Big Data to track infections and allocate resources.
Example: "Data collected from millions of app users helped in identifying infection clusters and planning containment zones."
Departments in Government of India requiring Big Data Analysis, website or interface to learn about Big Data in Governance, Government of India Initiatives in Big Data Analysis?
https://ndap.niti.gov.in/
Several departments within the Government of India actively utilize big data analytics to enhance decision-making and service delivery:
Ministry of Finance: Employs data analytics to improve tax administration and compliance. Both direct and indirect tax departments utilize big data and AI/ML techniques to make tax administration more effective and taxpayer-friendly.
Ministry of Electronics and Information Technology (MeitY): Under the Digital India programme, MeitY has launched numerous e-services and initiatives that leverage big data to promote innovation and improve governance.
NITI Aayog: Through its Data Management and Analysis Vertical, NITI Aayog addresses issues related to streamlining data usage for public policy. A key initiative is the National Data and Analytics Platform (NDAP), which democratizes access to public government data by making it accessible, interoperable, and interactive.
National Informatics Centre (NIC): Provides data analytics services to various government departments, aiding in the development of e-Government applications and managing the National Knowledge Network.
To learn more about big data in governance and the Government of India's initiatives in big data analysis, the following platforms offer valuable resources:
Open Government Data (OGD) Platform India: A single-point access to datasets, documents, services, tools, and applications published by various government ministries and departments. It facilitates data sharing and promotes innovation over non-personal data.
National Data and Analytics Platform (NDAP): Developed by NITI Aayog, NDAP aims to democratize access to public government data by making it accessible, interoperable, and interactive. It hosts datasets from various government agencies and provides tools for analytics and visualization.
These platforms serve as comprehensive resources for understanding and engaging with the Government of India's big data initiatives.
Step wise practical on data by Govt of India?
Access Government Data
Platforms to Access Data:
Open Government Data (OGD) Platform India: data.gov.in
National Data and Analytics Platform (NDAP): ndap.niti.gov.in
Steps:
Visit the website.
Browse or search for datasets by categories (e.g., education, health, finance).
Download datasets in formats like CSV, JSON, or Excel.
Understand the Data
Steps:
Read the metadata or data dictionary provided with the dataset.
Understand key variables, data types, and the context of the dataset.
Clean and preprocess the data for analysis using tools like Excel or Python libraries.
Recommended Tools:
Spreadsheet Software: Microsoft Excel, Google Sheets
Python Libraries: Pandas, NumPy
Explore the Data
Steps:
Perform descriptive statistics to summarize the data.
Create visualizations to identify patterns and trends.
Use tools to filter and query data for specific insights.
Recommended Tools:
Visualization: Tableau Public, Power BI, Matplotlib, Seaborn (Python)
Querying Data: SQL, Python (Pandas)
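A small pandas sketch of this explore step; the rainfall figures here are made-up illustrative numbers, not an actual government dataset:

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["Kerala", "Rajasthan", "Assam", "Punjab"],
    "Annual Rainfall (mm)": [3055, 313, 2818, 649]
})

# Descriptive statistics: count, mean, std, min, quartiles, max.
print(df["Annual Rainfall (mm)"].describe())

# Query: states with above-average rainfall.
wet = df[df["Annual Rainfall (mm)"] > df["Annual Rainfall (mm)"].mean()]
print(wet["State"].tolist())  # ['Kerala', 'Assam']
```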
Perform Advanced Analysis
Steps:
Use statistical methods for deeper insights.
Apply machine learning models for predictions or classifications.
Use geographic data for spatial analysis.
Recommended Tools:
Statistical Analysis: R, Python (Statsmodels, Scikit-learn)
GIS and Spatial Analysis: QGIS, ArcGIS, Google Earth Engine
Generate Insights
Steps:
Compile insights into reports or dashboards.
Share findings with stakeholders or the public.
Recommended Tools:
Dashboards: Power BI, Tableau, Streamlit (Python)
Documenting: Jupyter Notebooks, Google Docs
Step wise practical on Govt of India data?
Download a Dataset
Go to the Open Government Data (OGD) Platform India: https://data.gov.in.
Search for a dataset. For example:
Dataset: "State-wise Monthly Rainfall Data (2004-2020)"
Link: State-wise Rainfall Data
Download the dataset as a CSV file.
Set Up Tools
Use Google Colab (no installation required) or Jupyter Notebook.
Google Colab: Open Google Colab and create a new notebook.
Jupyter Notebook: If installed, launch it from your computer.
Upload the Dataset
Open your notebook in Google Colab or Jupyter.
Upload the CSV file.
Google Colab: Use the upload button in the sidebar or write the code to upload.
from google.colab import files
uploaded = files.upload()
Save the file in your working directory if using Jupyter.
Write Python Code
Here is a complete script to showcase your big data analysis. Copy and paste it into your notebook.
Python Code:
# Importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
# Step 1: Load the dataset
# Replace 'rainfall.csv' with the name of your downloaded file
file_name = "rainfall.csv"
data = pd.read_csv(file_name)
# Step 2: Display the first few rows of the dataset
print("Dataset Overview:")
print(data.head())
# Step 3: Summary of data
print("\nDataset Summary:")
print(data.describe())
# Step 4: Clean the data (e.g., handling missing values)
print("\nChecking for missing values:")
print(data.isnull().sum())
# Dropping rows with missing values
data_cleaned = data.dropna()
# Step 5: Analyze data
# Example: Calculate average annual rainfall per state
average_rainfall = data_cleaned.groupby("State")["Annual Rainfall"].mean()
print("\nAverage Annual Rainfall per State:")
print(average_rainfall)
# Step 6: Visualize the data
# Example: Bar plot of average rainfall
average_rainfall.plot(kind="bar", figsize=(12, 6))
plt.title("Average Annual Rainfall by State")
plt.xlabel("State")
plt.ylabel("Average Rainfall (mm)")
plt.tight_layout()
plt.show()
Explain Your Steps During the Session
Load Data: Show participants how you loaded the dataset into the notebook.
Overview: Explain the dataset structure using head() and describe().
Data Cleaning: Discuss why handling missing values is crucial.
Analysis: Show how you calculated the average rainfall for each state.
Visualization: Explain how the bar chart represents the results.
Practice the Steps
Run this practical multiple times before the session to ensure smooth execution.
Familiarize yourself with the dataset and outputs so you can confidently explain the results.
Value and Velocity in Big Data?
1. Value in Big Data
Value refers to the usefulness and insights derived from analyzing Big Data. It focuses on how the data can be transformed into meaningful and actionable information, leading to better decision-making, innovation, and business opportunities. Not all data is valuable, so the challenge is to extract value from the massive amounts of data being collected.
Examples of Value in Big Data:
Healthcare: Predictive Analytics for Patient Care
Example: Hospitals collect massive amounts of patient data (medical history, diagnostics, lab results). Using Big Data analytics, healthcare providers can identify patterns in this data to predict patient health outcomes. For instance, by analyzing vital signs and historical records, hospitals can predict when patients might need readmission or when certain complications might arise. This improves patient care and reduces costs.
Value: The value here is derived from saving lives, improving care quality, and reducing hospital readmissions.
E-commerce: Personalized Recommendations
Example: Platforms like Amazon collect vast amounts of data about customer preferences, browsing behavior, and purchase history. By analyzing this data, Amazon can provide personalized product recommendations to customers, increasing sales and improving the user experience.
Value: The value for the business is in increased revenue and customer satisfaction through more targeted marketing and recommendations.
Finance: Fraud Detection
Example: Banks and financial institutions collect data on millions of transactions daily. By analyzing patterns in this data, Big Data tools can detect fraudulent activities in real-time, such as unusual transaction amounts, locations, or behavior.
Value: The value lies in protecting customers from fraud, saving money for the bank, and maintaining trust.
Supply Chain Optimization
Example: A retail company can use Big Data to monitor its entire supply chain in real-time, adjusting inventory levels based on consumer demand, sales data, and delivery patterns.
Value: Optimizing supply chains leads to reduced costs, faster deliveries, and improved customer satisfaction.
The main takeaway is that Value in Big Data is about turning raw data into insights that can help businesses or organizations make smarter decisions.
2. Velocity in Big Data
Velocity refers to the speed at which data is generated, collected, and processed. In Big Data, data is produced at an extremely fast rate, often in real-time, and it needs to be processed quickly to extract timely insights and take action.
Examples of Velocity in Big Data:
Social Media Monitoring (Twitter, Facebook)
Example: Twitter processes around 500 million tweets per day (about 6,000 tweets per second). Social media platforms must analyze data in real-time to detect trends, monitor user behavior, and respond to emerging topics or crises (e.g., viral trends, breaking news).
Velocity: The velocity here is high because the data is being generated and analyzed in real-time, often requiring instant processing to capitalize on trends or prevent the spread of misinformation.
Stock Market Trading
Example: Financial markets generate an enormous amount of data in real-time, with millions of transactions happening every second. Big Data algorithms analyze this information in real-time to enable high-frequency trading (HFT), where traders make buying and selling decisions based on market fluctuations in milliseconds.
Velocity: In this case, velocity is extremely high, as data must be processed in real-time to gain a competitive edge and make split-second financial decisions.
Smart Cities and IoT
Example: In a smart city, IoT sensors collect real-time data about traffic flow, air quality, electricity usage, and water systems. To manage city services efficiently, this data must be analyzed in real-time to adjust traffic lights, optimize energy usage, or predict maintenance needs.
Velocity: The velocity is high because the data is continuously streaming, and decisions must be made in real-time to keep city systems running smoothly.
Real-Time Fraud Detection
Example: Credit card companies like Visa and Mastercard process thousands of transactions per second. To detect and prevent fraud, they need to analyze these transactions in real-time, identifying suspicious activities as they occur.
Velocity: Fraud detection systems rely on high-velocity data processing to identify and stop fraudulent transactions as they happen.
Streaming Services (Netflix, YouTube)
Example: Platforms like Netflix and YouTube collect data in real-time about user behavior, such as what people are watching, how long they stay on the platform, and what content they interact with. This data is analyzed in real-time to provide personalized recommendations and to improve user experience.
Velocity: The velocity is high because data is generated continuously while users are interacting with the platform, requiring immediate processing.
Summary of Value and Velocity in Big Data
Value: Refers to how useful the data is in generating insights that lead to better decision-making and outcomes. Examples include personalized recommendations (e-commerce), predictive analytics (healthcare), and fraud detection (finance).
Velocity: Refers to the speed at which data is generated and processed, often in real-time. Examples include social media monitoring (Twitter), financial trading (stock markets), and IoT data streams (smart cities).
Both Value and Velocity are crucial in Big Data analysis. Value is what you get from data insights, while Velocity is how fast you need to process the data to get those insights in time for them to be useful.
What is Veracity in the 5 Vs of Big Data?
In the context of the 5 V’s of Big Data, Veracity refers to the quality and reliability of the data. It emphasizes the importance of ensuring that the data being processed is trustworthy, accurate, and meaningful. In Big Data, since data comes from multiple sources and is generated at a high speed, there is often uncertainty or inconsistency regarding its quality. Not all the data collected is accurate, complete, or reliable, which can make analysis difficult.
Explaining "Veracity" in Big Data
Veracity in Big Data highlights:
Data quality: Is the data clean and accurate, or does it have errors, noise, or inconsistencies?
Trustworthiness: Can the data be trusted to make decisions? Is it coming from reliable sources?
Uncertainty: How confident can you be in the insights derived from the data, given its possible flaws or bias?
The veracity of Big Data is often one of the most challenging aspects because inaccurate or low-quality data can lead to faulty conclusions or poor decision-making.
Examples to Illustrate Veracity
Here are some practical examples of Veracity to make the concept clearer:
1. Social Media Data (Low Veracity)
Example: Consider a social media platform like Twitter, where millions of posts are generated every minute. While a lot of valuable insights can be derived from social media data (e.g., public sentiment, trends), not all data is reliable.
Issues: Some posts may contain spam, fake news, irrelevant content, or even misinformation. For example, bots generating fake comments, or biased opinions from people trying to manipulate trends, can affect the trustworthiness of the analysis.
Challenge: The platform needs to filter out this low-quality data (noise) and focus on authentic interactions to derive meaningful insights. This uncertainty about the reliability of user-generated content is an example of low veracity in Big Data.
2. IoT Devices (Sensor Data)
Example: A smart city relies on sensors for monitoring traffic, pollution, weather, and more. These sensors generate large amounts of data, but sometimes, the data might be incomplete or inaccurate.
Issues: Sensors can malfunction, sending inconsistent or corrupted data (e.g., temperature sensors recording faulty readings due to weather conditions). Different sensors might also produce conflicting data for the same location.
Challenge: When analyzing the data, it’s critical to filter out incorrect or inconsistent readings to make accurate decisions. The uncertainty about the correctness of the data captured from IoT devices reflects veracity issues.
3. Customer Data in E-commerce
Example: An online retailer collects data about customers, including purchasing habits, preferences, and demographics.
Issues: Some customers might enter incorrect information (such as wrong email addresses or fake phone numbers). Additionally, duplicate or incomplete records (e.g., customers having multiple accounts or missing information in their profiles) can reduce data quality.
Challenge: Ensuring that the customer data is clean, deduplicated, and accurate is essential to make trustworthy business decisions, like personalized recommendations or targeted marketing. The reliability and accuracy of customer profiles reflect the veracity of the data.
4. Financial Data (High Veracity Needed)
Example: In the financial industry, banks and trading platforms collect vast amounts of data about transactions, stock prices, and economic indicators.
Issues: Small errors in this data (e.g., incorrect transaction records, duplicate entries, or delays in real-time updates) can have major financial consequences. If inaccurate data is used for trading algorithms or investment decisions, it can lead to significant losses.
Challenge: Financial data needs to be precise, accurate, and real-time to ensure reliable decisions. This is why the veracity of data is critical in the financial sector, where even minor errors can lead to large-scale impacts.
Dealing with Veracity Issues
To deal with veracity in Big Data, companies often implement:
Data cleansing techniques: Removing duplicates, correcting errors, and filling in missing information to improve data quality.
Filtering unreliable sources: Detecting and eliminating data from low-trust sources, such as fake accounts or spam.
Consistency checks: Ensuring that data coming from different sources or sensors is consistent with expected ranges or known facts.
For example, in e-commerce, customer data can be cross-checked with third-party services to verify accuracy, and social media platforms may use algorithms to flag or remove misleading or irrelevant content.
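The cleansing steps above can be sketched in pandas. This is a minimal illustration on invented records; the column names, the duplicate, and the "plausible age" range are all assumptions for demonstration, not from a real dataset.

```python
import pandas as pd

# Hypothetical customer records with a duplicate, a missing email, and an out-of-range age
df = pd.DataFrame({
    'customer_id': [1, 2, 2, 3, 4],
    'email': ['a@x.com', 'b@x.com', 'b@x.com', None, 'd@x.com'],
    'age': [34, 28, 28, 45, 190],
})

# 1. Data cleansing: remove exact duplicate records
df = df.drop_duplicates()

# 2. Filter unreliable entries: drop records missing a required field
df = df.dropna(subset=['email'])

# 3. Consistency check: keep only values in a plausible range
df = df[df['age'].between(0, 120)]

print(df)  # only customers 1 and 2 survive the checks
```

Each step mirrors one of the techniques listed above; in a real pipeline the rules (required fields, valid ranges) would come from domain knowledge rather than being hard-coded.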
Summary of Veracity in Big Data
Veracity in Big Data refers to the quality, reliability, and trustworthiness of the data. It's about ensuring that the data you’re analyzing is accurate and clean because poor-quality data leads to faulty insights. High veracity means data is consistent, error-free, and can be trusted, while low veracity means the data is uncertain, inconsistent, or potentially misleading.
Examples:
Social media data often suffers from low veracity due to misinformation or spam.
IoT sensors can produce inconsistent or faulty readings that need to be filtered.
Customer data in e-commerce may contain errors or duplicates, affecting data quality.
Financial data requires high veracity to avoid costly mistakes in real-time decision-making.
Ensuring high veracity is a critical step in any Big Data project to guarantee accurate and valuable insights.
What is Volume in 5V of Big Data?
In the context of the 5 V’s of Big Data, the Volume refers to the sheer amount of data being generated, stored, and processed. It highlights the vast quantity of data that comes from various sources, often measured in terabytes (TB), petabytes (PB), or even zettabytes (ZB). Volume is often considered the defining characteristic of Big Data, as traditional databases and systems struggle to handle the large scale.
Explaining "Volume" in Big Data
Volume in Big Data means "how much" data is generated. This data can come from a variety of sources:
Social media posts: Every day, platforms like Facebook, Twitter, and Instagram generate billions of posts, likes, comments, and shares.
Sensor data: In a smart city, millions of sensors on traffic lights, buildings, and streets constantly send streams of data about temperature, traffic, air quality, etc.
Transactional data: E-commerce platforms like Amazon process millions of transactions per minute, logging customer details, purchases, reviews, and more.
These examples show the massive volume that Big Data systems handle daily. The critical challenge is storing and processing this data efficiently.
Example to Illustrate Volume
Let’s say you’re explaining the Volume aspect to someone:
Example 1: Social Media (Facebook)
Volume: Facebook generates 4 petabytes of data daily, including likes, shares, posts, photos, and videos.
Explanation: Imagine how much data that is—it's equivalent to 1 million HD movies being stored every day. Traditional databases wouldn’t be able to handle this volume, but Big Data systems like Hadoop or Spark can process such a massive amount of data efficiently.
Example 2: E-commerce (Amazon)
Volume: Amazon generates around 1 million transactions per second during peak times, producing terabytes of customer purchase data, product reviews, recommendations, and delivery information.
Explanation: To make it relatable, you could say that in one minute, Amazon handles as much data as a small company might handle in an entire year!
Example 3: Internet of Things (IoT)
Volume: In a smart city, IoT devices (smart meters, traffic sensors, air quality monitors) generate terabytes of data per day. Each sensor continuously sends data on energy usage, vehicle movement, or environmental conditions.
Explanation: Think of this as the city "talking" to itself—millions of sensors collecting information simultaneously, and all this data needs to be processed in real-time.
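The scale claims above can be sanity-checked with quick arithmetic. Assuming an HD movie is roughly 4 GB (an assumption for illustration), 4 petabytes per day does work out to about a million movies:

```python
# Back-of-the-envelope check on the "4 PB ≈ 1 million HD movies" claim
GB = 10**9           # bytes in a gigabyte (decimal units)
PB = 10**15          # bytes in a petabyte
daily_data = 4 * PB  # ~4 PB generated per day
movie_size = 4 * GB  # assumed size of one HD movie

movies_per_day = daily_data / movie_size
print(f"{movies_per_day:,.0f} HD movies per day")  # 1,000,000
```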
Summary of Volume in Big Data
The volume in Big Data refers to the massive amount of data generated by various digital processes, devices, and interactions. As technology advances, this volume continues to grow exponentially.
For citing examples of volume:
Social media generates 4 petabytes of data daily.
E-commerce platforms like Amazon handle 1 million transactions per second.
Smart cities produce terabytes of sensor data every day from IoT devices.
These examples clearly demonstrate how the volume of Big Data is far beyond the capabilities of traditional data systems, necessitating specialized tools and techniques for managing such enormous datasets.
Practical on Big Data Analysis?
To define Big Data Analysis and perform a practical demonstration of it on any computer with an internet connection, you can leverage cloud-based services, open-source tools, and datasets available online. Here’s how you can approach this:
1. Defining Big Data Analysis
Big Data Analysis refers to the process of examining large and complex datasets, often involving millions or billions of records. These datasets are so large that traditional data processing tools struggle to handle them. Big Data Analysis involves extracting useful information from this data to identify patterns, trends, correlations, and insights. It typically involves:
Volume: Huge amounts of data.
Variety: Different types of data (text, video, images, structured/unstructured data).
Velocity: Speed at which data is generated and processed.
Veracity: Trustworthiness of data.
Value: Insights and actionable information extracted from the data.
2. Tools and Technologies for Big Data Analysis
To perform a practical Big Data analysis, you can use the following tools and coding environments that are accessible via the internet:
a) Python and Libraries for Big Data Analysis
Python is a versatile language for Big Data analysis with its rich ecosystem of libraries.
Pandas: For data manipulation and analysis.
NumPy: For numerical data processing.
Dask: For parallel computing and handling larger-than-memory datasets.
PySpark: Python API for Apache Spark, which is a popular distributed computing framework.
Matplotlib/Seaborn: For data visualization.
You can run Python code locally or via cloud-based environments (e.g., Google Colab or Jupyter Notebooks).
b) Apache Spark (via PySpark)
Apache Spark is one of the most powerful tools for Big Data processing. It's fast, and built for distributed computing. You can use PySpark to work with Spark in Python. This is suitable for analyzing huge datasets using a distributed computing model.
You can use cloud services like Google Colab, which allows running PySpark code without needing local setup.
c) Cloud Platforms
If you want to explore more serious Big Data analysis:
Google Cloud (BigQuery): Offers a managed data warehouse with massive scale and SQL-based querying.
Amazon Web Services (AWS S3 + EMR): Store data in S3 and use Elastic MapReduce for Big Data analysis using Hadoop or Spark.
Azure Data Lake Analytics: For scalable cloud-based analysis.
3. Performing a Practical Big Data Analysis
Here’s how you can run a Big Data Analysis practical using publicly available datasets and tools accessible from any internet-connected computer.
Step 1: Get a Dataset
You need a large dataset to demonstrate Big Data. Some useful sources:
Kaggle: Free datasets related to various domains.
Example: Kaggle Datasets
Google Dataset Search: Search engine for finding large datasets.
Open Data Portals: Websites like data.gov or European Union Open Data Portal provide free public datasets.
For example, you could use:
New York City Taxi Trip Data: Over 1 billion records of taxi rides in NYC.
Global Temperature Data: Huge datasets for climate research.
COVID-19 Data: Large-scale, globally collected datasets.
Step 2: Set Up an Environment (Google Colab for Example)
Go to Google Colab. This is a free, cloud-based Python environment with access to many Big Data libraries.
Install necessary libraries (Pandas, PySpark, Dask, etc.).
Step 3: Analyze the Data
Here’s an example of Big Data analysis using PySpark in Google Colab.
Install PySpark on Colab:
!pip install pyspark
Load and Analyze Data (using NYC taxi data as an example):
from pyspark.sql import SparkSession
# Initialize Spark session
spark = SparkSession.builder.appName("BigDataAnalysis").getOrCreate()
# Load data (assuming NYC taxi data is available as a CSV on cloud storage or local drive)
df = spark.read.csv("/path-to-your-nyc-taxi-data.csv", header=True, inferSchema=True)
# Show data schema
df.printSchema()
# Example: Calculate average trip distance
df.select('trip_distance').summary('mean').show()
Run Queries and Visualize
Filter data based on conditions (e.g., find all trips over 10 miles).
Visualize patterns such as trip duration vs distance using matplotlib/seaborn.
Summarize Findings: You can summarize the findings in terms of trends or anomalies discovered, such as peak travel times, average fare per distance, or other insights.
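As a quick illustration of the filtering step, here is the same idea in pandas on a tiny invented sample (the `trip_distance` column name matches the NYC taxi schema, but the values are made up; in PySpark the equivalent would be `df.filter(df.trip_distance > 10)`):

```python
import pandas as pd

# Tiny invented sample standing in for the NYC taxi data
trips = pd.DataFrame({
    'trip_distance': [1.2, 12.5, 3.4, 15.0, 0.8],
    'total_amount':  [6.5, 42.0, 12.0, 55.5, 5.0],
})

# Find all trips over 10 miles
long_trips = trips[trips['trip_distance'] > 10]
print(long_trips)
```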
Step 4: Interpretation and Reporting
After running the practical analysis, you can report:
The size and variety of the dataset (to demonstrate the "Big" part of Big Data).
The velocity at which insights were derived using PySpark.
The veracity of the data and whether any cleaning was required.
The value in terms of insights (e.g., how the data analysis can help optimize services like taxis or resource allocation).
A practical can be done on the elements of data analytics (data source, data lake, data processing, data analysis, data interpretation, and data use) with examples and resources
1. Data Source
The data source is where the data originates. It can come from multiple sources like databases, APIs, sensors, or publicly available datasets.
Example: NYC Taxi Dataset
Source: The NYC Taxi & Limousine Commission provides an open dataset that records every trip made by a taxi in NYC. You can download data from NYC Taxi Data.
Practical Task:
Go to the website and download a month's worth of NYC Taxi data (e.g., January 2023).
You can also directly import it into Google Colab:
!wget https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet
2. Data Lake
A data lake is a centralized repository that allows you to store structured and unstructured data at scale. A popular tool for this is AWS S3, but you can also use Google Cloud Storage or even local storage on your system or Google Colab environment for simplicity.
Example with Google Colab or AWS S3:
Local Data Lake (Colab): The dataset you downloaded can be considered part of a local "data lake" in your Google Colab environment.
AWS S3 for Data Lake:
Create an AWS S3 bucket.
Upload the NYC Taxi dataset (or other datasets).
Use AWS CLI or Boto3 to access the data from your data lake programmatically.
import boto3
# Access your AWS S3 bucket
s3 = boto3.resource('s3')
bucket = s3.Bucket('your-bucket-name')
# Download the dataset
bucket.download_file('your-file.csv', 'local-filename.csv')
3. Data Processing
Data Processing is the step where raw data is cleaned and transformed into a more usable format. This typically involves removing errors, missing values, and outliers, as well as transforming the data into formats that are more suitable for analysis.
Example: Data Cleaning and Transformation with Pandas:
Data Cleaning: Use Pandas to handle missing data and remove irrelevant columns from the dataset.
import pandas as pd
# Load the dataset
df = pd.read_parquet("yellow_tripdata_2023-01.parquet")
# Check for missing data
print(df.isnull().sum())
# Remove missing values
df_cleaned = df.dropna()
# Remove irrelevant columns
df_cleaned = df_cleaned[['tpep_pickup_datetime', 'tpep_dropoff_datetime', 'passenger_count', 'trip_distance', 'total_amount']]
Data Transformation: Convert columns (e.g., timestamps) into useful formats and create new columns if necessary (e.g., trip duration).
# Convert pickup and dropoff times to datetime
df_cleaned['pickup_datetime'] = pd.to_datetime(df_cleaned['tpep_pickup_datetime'])
df_cleaned['dropoff_datetime'] = pd.to_datetime(df_cleaned['tpep_dropoff_datetime'])
# Create a new column for trip duration in minutes
df_cleaned['trip_duration'] = (df_cleaned['dropoff_datetime'] - df_cleaned['pickup_datetime']).dt.total_seconds() / 60
4. Data Analysis
Data Analysis involves applying statistical and machine learning techniques to discover patterns or trends in the data.
Example: Basic Descriptive Statistics and Visualizations:
Descriptive Statistics: Calculate basic statistics such as mean, median, and distribution of trips and fares.
# Descriptive statistics for trip distance and total amount
df_cleaned[['trip_distance', 'total_amount']].describe()
Data Visualization: Use Matplotlib and Seaborn to visualize data, like the distribution of trip distances or total fare amounts.
import matplotlib.pyplot as plt
import seaborn as sns
# Plot distribution of trip distances
plt.figure(figsize=(10,6))
sns.histplot(df_cleaned['trip_distance'], bins=50, kde=False)
plt.title('Distribution of Trip Distances')
plt.xlabel('Trip Distance (miles)')
plt.ylabel('Frequency')
plt.show()
Advanced Analysis: You can apply machine learning models to predict the fare based on trip distance, passenger count, etc., using libraries like Scikit-Learn.
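As a minimal sketch of that idea, here is a fare-from-distance prediction using a simple least-squares line fitted with NumPy (a stand-in for a full Scikit-Learn pipeline; the distance and fare values are invented and chosen to lie on a line):

```python
import numpy as np

# Invented sample: trip distance (miles) and fare (dollars)
distance = np.array([1.0, 2.0, 3.0, 5.0, 8.0])
fare     = np.array([6.0, 9.0, 12.0, 18.0, 27.0])

# Fit fare ≈ slope * distance + intercept by least squares
slope, intercept = np.polyfit(distance, fare, 1)

# Predict the fare for a 4-mile trip
predicted = slope * 4.0 + intercept
print(f"Predicted fare for 4 miles: ${predicted:.2f}")  # $15.00
```

On real taxi data you would add more features (passenger count, time of day) and move to a proper model with train/test splits.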
5. Data Interpretation
Data Interpretation involves understanding the results from the data analysis and drawing meaningful conclusions. This step is key to turning raw data into insights that can drive decision-making.
Example: Insights from NYC Taxi Data:
Insight 1: Discover the busiest hours for NYC taxis based on pickup times.
df_cleaned['hour'] = df_cleaned['pickup_datetime'].dt.hour
hourly_trips = df_cleaned.groupby('hour').size()
hourly_trips.plot(kind='bar', figsize=(10,6), title='Number of Taxi Trips per Hour')
plt.xlabel('Hour of Day')
plt.ylabel('Number of Trips')
plt.show()
Interpretation: The plot might show a peak during rush hours (8-9 AM, 5-6 PM), indicating the busiest periods for taxi drivers.
Insight 2: Determine the correlation between trip distance and total fare.
# Calculate correlation
correlation = df_cleaned[['trip_distance', 'total_amount']].corr()
print("Correlation between trip distance and total fare:\n", correlation)
Interpretation: A positive correlation would suggest that longer trips tend to generate higher fares, but analyzing outliers (like very short trips with high fares) can offer deeper insights.
6. Data Use
Data Use refers to applying the insights to solve real-world problems or improve decision-making. This could be in business strategy, operations, customer service, etc.
Example: Taxi Fleet Optimization:
Problem: A taxi company might want to optimize its fleet based on the busiest times and locations.
Use Case: Using the insights from the analysis, the company could:
Increase the number of taxis available during peak hours.
Position taxis near high-demand drop-off locations.
Adjust fare pricing dynamically based on demand.
Sharing the Insights:
You can export your cleaned dataset or analysis to a CSV for sharing or further analysis.
# Save the processed and analyzed data to a CSV
df_cleaned.to_csv('processed_nyc_taxi_data.csv', index=False)
# Download the CSV file
from google.colab import files
files.download('processed_nyc_taxi_data.csv')
Resources:
Here are some useful resources for each stage:
Data Sources:
NYC Taxi Data: NYC Taxi & Limousine Commission Data
Kaggle Datasets: Kaggle
Data Lake:
AWS S3: Amazon S3
Google Cloud Storage: Google Cloud Storage
Data Processing:
Pandas Documentation: Pandas
AWS Glue (Data Processing Service): AWS Glue
Data Analysis:
Matplotlib Documentation: Matplotlib
Seaborn Documentation: Seaborn
Data Interpretation & Use:
Data Science for Business: Data Science for Business Book
By following these steps, you can practically demonstrate the entire Data Analytics Pipeline with an example dataset. You will have touched on all key elements, from sourcing the data to drawing valuable insights and making data-driven decisions.
My SQL coding
SQL Datatypes
CHAR(50)
VARCHAR(50)
BLOB(1000)
INT
TINYINT
BIGINT
BIT
FLOAT
DOUBLE
DECIMAL
NUMERIC
BOOLEAN
DATE = YYYY-MM-DD
YEAR
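Several of these datatypes can be tried out with Python's built-in sqlite3 module. Note this is a sketch: SQLite accepts these type names but maps them to its own storage classes, so lengths like VARCHAR(50) are not enforced the way MySQL enforces them, and the table and columns below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # temporary in-memory database
cur = conn.cursor()

# Invented table using several of the datatypes listed above
cur.execute("""
    CREATE TABLE employee (
        id        INT,
        name      VARCHAR(50),
        salary    DECIMAL(10, 2),
        is_active BOOLEAN,
        hire_date DATE          -- stored as 'YYYY-MM-DD' text
    )
""")
cur.execute("INSERT INTO employee VALUES (1, 'Alice', 55000.50, 1, '2023-04-01')")
conn.commit()

for row in cur.execute("SELECT name, salary, hire_date FROM employee"):
    print(row)  # ('Alice', 55000.5, '2023-04-01')
```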
Lookup functions, count functions and customisation of functions in data analysis
To explain Lookup functions, Count functions, and the Customization of functions in data analysis with practical examples, you can use a spreadsheet tool (e.g., Google Sheets, Excel) or a Python-based environment (e.g., Google Colab, Jupyter Notebook). Below is a step-by-step guide on how you can practically explain each of these concepts:
1. Lookup Functions: VLOOKUP, HLOOKUP, and Python Pandas Equivalent
Lookup functions allow you to search for specific data in a table or range and return a corresponding value from another column or row. These functions are extremely useful when you need to extract data based on specific criteria.
Example in Excel or Google Sheets:
VLOOKUP (Vertical Lookup): This searches for a value in the first column of a range and returns a value in the same row from another column.
Formula:
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Example: Imagine you have a table with Employee IDs in Column A and Employee Names in Column B. To find the name of the employee with ID 102, you can use:
=VLOOKUP(102, A1:B10, 2, FALSE)
HLOOKUP (Horizontal Lookup): Similar to VLOOKUP, but it searches for a value in the top row and returns a value in the same column from another row.
Practical Example in Python using Pandas:
In Python, you can achieve similar functionality using Pandas.
import pandas as pd
# Create a sample dataset (similar to Excel table)
data = {'Employee_ID': [101, 102, 103, 104],
'Employee_Name': ['Alice', 'Bob', 'Charlie', 'Diana']}
# Convert to DataFrame
df = pd.DataFrame(data)
# Use a lookup function to find the employee name for Employee ID 102
employee_id = 102
employee_name = df.loc[df['Employee_ID'] == employee_id, 'Employee_Name'].values[0]
print(f'Employee with ID {employee_id} is {employee_name}')
This code will print:
Employee with ID 102 is Bob
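Another common pandas analogue of VLOOKUP is a merge, which looks up many values at once instead of one at a time (reusing the same invented employee table):

```python
import pandas as pd

employees = pd.DataFrame({'Employee_ID': [101, 102, 103, 104],
                          'Employee_Name': ['Alice', 'Bob', 'Charlie', 'Diana']})

# IDs to look up, like a column of VLOOKUP formulas
lookups = pd.DataFrame({'Employee_ID': [102, 104]})

# A left merge behaves like VLOOKUP with exact match (range_lookup = FALSE)
result = lookups.merge(employees, on='Employee_ID', how='left')
print(result)
```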
2. Count Functions: COUNT, COUNTIF, COUNTIFS in Excel/Sheets and Python Equivalent
Count functions are used to count the number of cells that contain data, meet certain criteria, or match specific conditions.
Example in Excel or Google Sheets:
COUNT: Counts the number of cells that contain numbers.
=COUNT(A1:A10)
COUNTIF: Counts the number of cells that meet a specific criterion.
=COUNTIF(A1:A10, ">5")
COUNTIFS: Counts the number of cells that meet multiple criteria.
=COUNTIFS(A1:A10, ">5", B1:B10, "<10")
Practical Example in Python using Pandas:
You can achieve the same functionality in Pandas.
# Sample dataset
data = {'Product': ['A', 'B', 'A', 'B', 'C', 'A'],
'Sales': [5, 15, 10, 5, 25, 10]}
df = pd.DataFrame(data)
# COUNT: Total number of sales entries
count_sales = df['Sales'].count()
print(f"Total Sales Entries: {count_sales}")
# COUNTIF: Number of sales greater than 10
countif_sales = (df['Sales'] > 10).sum()
print(f"Sales > 10: {countif_sales}")
# COUNTIFS: Number of products 'A' with sales > 5
countifs = len(df[(df['Product'] == 'A') & (df['Sales'] > 5)])
print(f"Product A with Sales > 5: {countifs}")
Output:
Total Sales Entries: 6
Sales > 10: 2
Product A with Sales > 5: 2
3. Customization of Functions: Custom Functions in Excel and Python
Custom functions allow you to create your own logic or operation. In Excel, you can use VBA (Visual Basic for Applications) for complex custom functions. In Python, you can define functions with custom logic using the def keyword.
Custom Function in Excel:
In Excel, you can create custom functions with VBA:
Open Excel.
Press Alt + F11 to open the VBA editor.
Insert a new Module.
Write a simple custom function, for example, to calculate the square of a number.
Function SquareNumber(num As Double) As Double
SquareNumber = num * num
End Function
Now, you can use =SquareNumber(A1) in your Excel sheet.
Custom Function in Python:
In Python, you can easily define a custom function. For example, to create a custom function that calculates the discounted price of a product:
# Custom function to calculate discount
def calculate_discount(price, discount_percentage):
    discount_amount = price * (discount_percentage / 100)
    discounted_price = price - discount_amount
    return discounted_price
# Test the function
price = 100
discount_percentage = 20
print(f"Discounted Price: {calculate_discount(price, discount_percentage)}")
Output:
Discounted Price: 80.0
You can create custom functions for any business logic, such as calculating tax, applying conditions to data, or even complex transformations.
Practical Steps to Demonstrate These in a Session:
Start with Lookup Functions:
Open Google Sheets or Excel.
Load a dataset and demonstrate VLOOKUP by finding a value.
Explain how Python's Pandas library can perform similar lookup tasks with .loc indexing.
Move to Count Functions:
Show how to use COUNT, COUNTIF, and COUNTIFS in Excel/Sheets.
In Python, demonstrate how to count entries and apply conditional counts using Pandas.
Custom Functions:
Show how to create custom functions in Excel using VBA.
Then, define custom functions in Python and explain how it allows you to create reusable code.
Resources:
Google Sheets: Google Sheets
Excel VBA Custom Functions: Excel VBA Documentation
Python Pandas Documentation: Pandas
Google Colab: Google Colab
These practical examples give users hands-on experience with common data analysis functions while showing how to transition between spreadsheet-based analysis (Excel, Google Sheets) and Python-based analysis (Pandas).
Is Excel a part of Big Data analysis? If not, why was the above question explained through Excel?
Excel is not typically considered a primary tool for Big Data Analysis because it has limitations in handling large datasets, scalability, and advanced processing capabilities. Big Data refers to massive datasets that cannot be processed or analyzed using traditional data tools, like Excel, which struggles when the data size exceeds its row limit (approximately 1 million rows).
However, Excel is widely used for basic data analysis and small to medium-sized datasets in businesses and educational settings. The reason I explained your earlier question using Excel is because:
Excel is Familiar: Many people, especially in non-technical fields or those beginning data analysis, are familiar with Excel or Google Sheets. It's easier to understand fundamental data analysis concepts like lookup functions, count functions, and custom functions in Excel because of its simplicity and visual interface.
Demonstrating Concepts: The concepts of lookup, count, and customization of functions can be effectively demonstrated in Excel for teaching purposes. These concepts are also transferable to more advanced tools like Python and SQL when you're ready to scale up to Big Data environments.
Transitional Learning: By showing how certain analysis functions work in Excel, which is familiar, I can help build a bridge to more advanced tools used for Big Data (like Python, Pandas, Spark, etc.). Many beginners in data analysis start with Excel and gradually move to advanced programming languages and platforms.
Why Excel is Not Ideal for Big Data Analysis:
Data Size Limitations: Excel can handle only up to ~1 million rows, but Big Data usually involves millions or even billions of rows.
Processing Power: Excel lacks the computational power and speed to efficiently process large volumes of data. Tools like Hadoop, Spark, or databases like Google BigQuery are used in Big Data environments for their distributed processing capabilities.
Scalability and Automation: Big Data tools are designed to handle data across distributed systems and can easily scale. Excel doesn’t have the same level of automation or scalability.
When to Transition from Excel to Big Data Tools:
If your data exceeds Excel's limitations, or you're dealing with datasets that involve complex processing or advanced analytics (e.g., machine learning, real-time data streaming), it’s time to transition to Big Data platforms such as:
Python & Pandas: For advanced data processing.
SQL/NoSQL Databases: For querying large datasets.
Hadoop/Spark: For distributed data processing.
Cloud Platforms: AWS S3, Google BigQuery, or Azure for large data storage and analysis.
Conclusion:
While Excel is useful for small-scale data analysis, it's not suitable for Big Data. The earlier explanation was framed around Excel because the question related to fundamental data analysis concepts, which can be more easily understood and visualized in a simpler tool like Excel before moving on to more complex Big Data environments.
Exercise on Rainfall Pattern Analysis
- Go to the website: https://data.gov.in
- Search for a dataset. For simplicity, let’s use the following dataset:
- Dataset Name: State-wise Monthly Rainfall Data (2004-2020)
- Direct Link to Download: Rainfall Dataset
- Download the Dataset:
- Click the Download button on the page.
- Save the file on your computer.
- The file name will likely be rainfall.csv.
from google.colab import files  # needed for the upload widget

# Upload the file
uploaded = files.upload()

import pandas as pd  # Importing the Pandas library

file_name = "D67. Rainfall Monthly Total_1_0.csv"
data = pd.read_csv(file_name)

print("Dataset Overview:")
print(data.head())
Step 4: Check for Missing Values and Dataset Structure
This step ensures that the data is clean and ready for analysis.
- Copy and paste the following code into a new cell in Google Colab:
# Check for missing values in the dataset
print("\nMissing Values in Each Column:")
print(data.isnull().sum())  # Shows the count of missing values for each column

# Display the total number of rows and columns in the dataset
print("\nDataset Dimensions:")
print(f"Rows: {data.shape[0]}, Columns: {data.shape[1]}")
What to Expect
- A list of all column names in the dataset.
- A count of missing values in each column (if any).
- The number of rows and columns in the dataset.
Step 5: Clean the Data (Handle Missing Values)
If there are missing values in the dataset, we need to handle them. We’ll remove rows with missing values for simplicity.
- Copy and paste the following code into a new cell in Google Colab:
# Remove rows with missing values
data_cleaned = data.dropna()

# Confirm that missing values have been removed
print("Missing Values After Cleaning:")
print(data_cleaned.isnull().sum())

# Display the first few rows of the cleaned dataset
print("\nCleaned Dataset Overview:")
print(data_cleaned.head())
What This Code Does
- Removes rows with missing values using data.dropna().
- Confirms that there are no missing values left.
- Displays the first few rows of the cleaned dataset for verification.
Step 6: Analyze the Data (Calculate Average Rainfall by State)
This step involves grouping the data by state and calculating the average annual rainfall for each state.
- Copy and paste the following code into a new cell in Google Colab:
# Group by state and calculate the average annual rainfall (column names assumed)
average_rainfall = data_cleaned.groupby("State")["Annual Rainfall"].mean()
# Display the results
print("Average Annual Rainfall by State:")
print(average_rainfall)
Step 6 (Revised): Check Column Names
- First, let’s confirm the exact column names in the dataset. Run the following code to display all the column names:
print(data_cleaned.columns.tolist())
- Check the output carefully. Look for the column names that correspond to State and Annual Rainfall. For example, they could be:
- "State" might actually be " State" or "STATE".
- "Annual Rainfall" might be "Annual_Rainfall" or "Rainfall (mm)".
For instance, if the columns are " State" and "Rainfall (mm)", your code will look like this:
average_rainfall = data_cleaned.groupby(" State")["Rainfall (mm)"].mean()
print("Average Annual Rainfall by State:")
print(average_rainfall)
Step 6 (Revised): Calculate Average Rainfall
- Copy and paste the following code into a new cell:
# Calculate the mean of the 'Total_rainfall' column
average_rainfall = data_cleaned['Total_rainfall'].mean()
# Display the result
print("Average Rainfall (across all months and years):")
print(f"{average_rainfall:.2f} mm")
What This Code Does
- Takes the 'Total_rainfall' column and calculates the mean value.
- Displays the average rainfall rounded to 2 decimal places.
Step 7: Visualize Rainfall Trend Over Time
- Copy and paste the following code into a new cell in Google Colab:
import matplotlib.pyplot as plt

# Convert 'Month-Year' to datetime format for better visualization
data_cleaned['Month-Year'] = pd.to_datetime(data_cleaned['Month-Year'])

# Plotting the rainfall trend
plt.figure(figsize=(12, 6))
plt.plot(data_cleaned['Month-Year'], data_cleaned['Total_rainfall'], marker='o', linestyle='-', color='b')

# Adding title and labels
plt.title("Rainfall Trend Over Time", fontsize=16)
plt.xlabel("Month-Year", fontsize=12)
plt.ylabel("Total Rainfall (mm)", fontsize=12)

# Improving x-axis readability
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()

# Show the plot
plt.show()
What This Code Does
- Converts the 'Month-Year' column into a datetime format so that it can be plotted on the x-axis.
- Creates a line plot showing how rainfall varies over time.
- Adds markers, labels, and a title to make the visualization informative.
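Monthly points can look noisy. One optional refinement is aggregating to yearly totals with resample; a sketch on a tiny hand-made series (the real notebook would apply this to data_cleaned after the datetime conversion):

```python
import pandas as pd

# Toy monthly data standing in for the real dataset
s = pd.DataFrame({
    "Month-Year": pd.to_datetime(["2011-01-01", "2011-02-01", "2012-01-01"]),
    "Total_rainfall": [0.0, 14.0, 30.0],
})

# Sum each calendar year's rainfall ("YS" = year-start frequency)
yearly = s.set_index("Month-Year")["Total_rainfall"].resample("YS").sum()
print(yearly)  # 2011 -> 14.0, 2012 -> 30.0
```

Plotting `yearly` instead of the monthly series gives a smoother long-term trend line.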
Step 1: Check the 'Month-Year' Column
We need to inspect the 'Month-Year' column to see what it contains.
- Run this code to display unique values in the 'Month-Year' column:
print(data_cleaned['Month-Year'].unique())
2. Check the output:
- Look for invalid, corrupted, or strange date formats (e.g., Jan-11 or empty strings).
- Confirm if all values are consistent (e.g., YYYY-MM or Month-Year).
Step 2: Fix or Format the 'Month-Year' Column
If the dates are in a format like Jan-11, we need to specify the correct date format. Use this code:
# Convert 'Month-Year' to datetime format, assuming it's in 'MMM-YY' (e.g., Jan-11)
data_cleaned['Month-Year'] = pd.to_datetime(data_cleaned['Month-Year'], format='%b-%y', errors='coerce')
# Check if any invalid dates exist
print("Invalid dates after conversion:")
print(data_cleaned[data_cleaned['Month-Year'].isna()])

# Drop rows with invalid dates
data_cleaned = data_cleaned.dropna(subset=['Month-Year'])
print("Conversion successful!")
Explanation of the Code
- format='%b-%y': Tells Python to interpret the dates as abbreviated month and two-digit year (e.g., Jan-11 = January 2011).
- errors='coerce': Replaces invalid date entries with NaT (Not a Time).
- dropna(subset=['Month-Year']): Removes rows where the date could not be converted.
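The effect of errors='coerce' is easy to see on a tiny example with one deliberately bad entry:

```python
import pandas as pd

# Two valid 'MMM-YY' strings and one junk entry
raw = pd.Series(["Jan-11", "Feb-11", "not a date"])
parsed = pd.to_datetime(raw, format="%b-%y", errors="coerce")

print(parsed.isna().sum())                 # 1 -> the junk entry became NaT
print(parsed.dropna().dt.year.tolist())    # [2011, 2011]
```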
Next step
data_cleaned = data_cleaned.dropna(subset=['Month-Year'])
Step 8: Visualize the Rainfall Trend
Now, plot the data to see how rainfall trends over time.
# Plotting the rainfall trend
plt.figure(figsize=(12, 6))
plt.plot(data_cleaned['Month-Year'], data_cleaned['Total_rainfall'], marker='o', linestyle='-', color='b')

# Adding title and labels
plt.title("Rainfall Trend Over Time", fontsize=16)
plt.xlabel("Month-Year", fontsize=12)
plt.ylabel("Total Rainfall (mm)", fontsize=12)

# Improving x-axis readability
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()

# Show the plot
plt.show()
Exercise on Rainfall executed?
from google.colab import files
uploaded = files.upload()
D67. Rainfall Monthly Total_1_0.csv(text/csv) - 1513 bytes, last modified: 11/30/2024 - 100% done
Saving D67. Rainfall Monthly Total_1_0.csv to D67. Rainfall Monthly Total_1_0 (1).csv
import pandas as pd
file_name = "D67. Rainfall Monthly Total_1_0.csv"
data = pd.read_csv(file_name)
print("Dataset Overview:")
print(data.head())
Dataset Overview:
Month-Year Total_rainfall
0 Jan-11 0.0
1 Feb-11 14.0
2 Mar-11 0.0
3 Apr-11 0.0
4 May-11 0.0
print("Column Names in the Dataset:")
print(data.columns)
Column Names in the Dataset:
Index(['Month-Year', 'Total_rainfall'], dtype='object')
print("\nMissing Values in Each Column:")
print(data.isnull().sum())
Missing Values in Each Column:
Month-Year 0
Total_rainfall 0
dtype: int64
print("\nDataset Dimensions:")
print(f"Rows: {data.shape[0]}, Columns: {data.shape[1]}")
Dataset Dimensions:
Rows: 132, Columns: 2
data_cleaned = data.dropna()
print("Missing Values After Cleaning:")
print(data_cleaned.isnull().sum())
Missing Values After Cleaning:
Month-Year 0
Total_rainfall 0
dtype: int64
print("\nCleaned Dataset Overview:")
print(data_cleaned.head())
Cleaned Dataset Overview:
Month-Year Total_rainfall
0 Jan-11 0.0
1 Feb-11 14.0
2 Mar-11 0.0
3 Apr-11 0.0
4 May-11 0.0
print("Column Names in the Dataset:")
print(data_cleaned.columns)
Column Names in the Dataset:
Index(['Month-Year', 'Total_rainfall'], dtype='object')
average_rainfall = data_cleaned["Total_rainfall"].mean()
print("Average Rainfall (across all months and years):")
print(f"{average_rainfall:.2f} mm")
Average Rainfall (across all months and years):
57.00 mm
import matplotlib.pyplot as plt
data_cleaned['Month-Year'] = pd.to_datetime(data_cleaned['Month-Year'], format='%b-%y', errors='coerce')
print("Invalid dates after conversion:")
print(data_cleaned[data_cleaned['Month-Year'].isna()])
Invalid dates after conversion:
Empty DataFrame
Columns: [Month-Year, Total_rainfall]
Index: []
data_cleaned = data_cleaned.dropna(subset=['Month-Year'])
print("Conversion successful!")
Conversion successful!
plt.figure(figsize=(12, 6))
plt.plot(data_cleaned['Month-Year'], data_cleaned['Total_rainfall'], marker='o', linestyle='-', color='b')
plt.title("Rainfall Trend Over Time", fontsize=16)
plt.xlabel("Month-Year", fontsize=12)
plt.ylabel("Total Rainfall (mm)", fontsize=12)
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
Exercise on Rainfall executed 2.0?
from google.colab import files
uploaded = files.upload()
D67. Rainfall Monthly Total_1_0.csv(text/csv) - 1513 bytes, last modified: 11/30/2024 - 100% done
Saving D67. Rainfall Monthly Total_1_0.csv to D67. Rainfall Monthly Total_1_0.csv
import pandas as pd
file_name ="D67. Rainfall Monthly Total_1_0.csv"
data = pd.read_csv(file_name)
print("Dataset Overview:")
print(data.head())
Dataset Overview:
Month-Year Total_rainfall
0 Jan-11 0.0
1 Feb-11 14.0
2 Mar-11 0.0
3 Apr-11 0.0
4 May-11 0.0
print("Column Names in the Dataset:")
print(data.columns)
Column Names in the Dataset:
Index(['Month-Year', 'Total_rainfall'], dtype='object')
print("\nMissing Values in Each Column:")
print(data.isnull().sum())
Missing Values in Each Column:
Month-Year 0
Total_rainfall 0
dtype: int64
print("\nDataset Dimensions:")
print(f"Rows: {data.shape[0]}, Columns: {data.shape[1]}")
Dataset Dimensions:
Rows: 132, Columns: 2
data_cleaned = data.dropna()
print("Missing Values After Cleaning:")
print(data_cleaned.isnull().sum())
Missing Values After Cleaning:
Month-Year 0
Total_rainfall 0
dtype: int64
print("\nCleaned Dataset Overview:")
print(data_cleaned.head())
Cleaned Dataset Overview:
Month-Year Total_rainfall
0 Jan-11 0.0
1 Feb-11 14.0
2 Mar-11 0.0
3 Apr-11 0.0
4 May-11 0.0
print("Column Names in the Dataset:")
print(data_cleaned.columns)
Column Names in the Dataset:
Index(['Month-Year', 'Total_rainfall'], dtype='object')
average_rainfall = data_cleaned["Total_rainfall"].mean()
print("Average Rainfall (across all months and years):")
print(f"{average_rainfall:.2f} mm")
Average Rainfall (across all months and years):
57.00 mm
import matplotlib.pyplot as plt
data_cleaned['Month-Year'] = pd.to_datetime(data_cleaned['Month-Year'], format='%b-%y', errors='coerce')
print("Invalid dates after conversion:")
print(data_cleaned[data_cleaned['Month-Year'].isna()])
Invalid dates after conversion:
Empty DataFrame
Columns: [Month-Year, Total_rainfall]
Index: []
data_cleaned = data_cleaned.dropna(subset=['Month-Year'])
print("Conversion successful!")
Conversion successful!
plt.figure(figsize=(12,6))
plt.plot(data_cleaned['Month-Year'], data_cleaned['Total_rainfall'], marker='o', linestyle='-', color='b')
plt.title("Rainfall Trend Over Time", fontsize=16)
plt.xlabel("Month-Year", fontsize=12)
plt.ylabel("Total Rainfall (mm)", fontsize=12)
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()
Exercise on Month, Region, Sales, Profit?
Step 1: Create a Custom CSV File
Copy and paste the following data into a plain text editor (e.g., Notepad):
Month,Region,Sales,Profit
January,North,20000,5000
January,South,18000,4500
January,East,22000,6000
January,West,15000,3000
February,North,25000,7000
February,South,20000,5000
February,East,27000,8000
February,West,16000,3500
March,North,30000,9000
March,South,22000,5500
March,East,29000,8500
March,West,17000,4000
2. Save it as a CSV file:
File Name: sales_data.csv
Step 2: Upload the CSV File to Google Colab
from google.colab import files
# Upload the CSV file
uploaded = files.upload()
What to Expect
Once uploaded, Colab will display:
Uploading sales_data.csv
Saving sales_data.csv to sales_data.csv
Step 3: Load and Display the Dataset
import pandas as pd # Importing Pandas for data handling
# Load the uploaded CSV file into a Pandas DataFrame
file_name = "sales_data.csv"
data = pd.read_csv(file_name)
# Display the first few rows of the dataset
print("Dataset Overview:")
print(data.head())
What to Expect
You should see the first 5 rows of the dataset, with columns Month, Region, Sales, and Profit.
Step 4: Visualize Sales by Region
import matplotlib.pyplot as plt
# Group the data by Region and sum the Sales
region_sales = data.groupby("Region")["Sales"].sum()
# Plot a bar chart
plt.figure(figsize=(8, 6))
region_sales.plot(kind="bar", color="skyblue", edgecolor="black")
# Add title and labels
plt.title("Total Sales by Region", fontsize=16)
plt.xlabel("Region", fontsize=12)
plt.ylabel("Total Sales (in $)", fontsize=12)
# Show the plot
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()
What to Expect
A bar chart showing total sales for each region (North, South, East, West).
Step 5: Visualize Profit Trends by Month
# Group the data by Month and sum the Profit
month_profit = data.groupby("Month")["Profit"].sum()
# Sort the months in calendar order
month_order = ["January", "February", "March"]
month_profit = month_profit.reindex(month_order)
# Plot a line chart
plt.figure(figsize=(8, 6))
month_profit.plot(kind="line", marker='o', color="green", linewidth=2)
# Add title and labels
plt.title("Profit Trends by Month", fontsize=16)
plt.xlabel("Month", fontsize=12)
plt.ylabel("Total Profit (in $)", fontsize=12)
# Show the plot
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()
What This Code Does
Groups the data by Month and calculates the total Profit for each month.
Sorts the months in the correct calendar order (January → February → March).
Creates a line chart with green markers and lines to represent the profit trend.
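A pivot table shows Sales for every Month and Region at once, complementing the two separate charts above. A sketch with a few of the sales_data.csv rows inlined so it runs standalone:

```python
import pandas as pd

# Same shape as sales_data.csv; a few rows inlined for a self-contained example
data = pd.DataFrame({
    "Month":  ["January", "January", "February", "February"],
    "Region": ["North", "South", "North", "South"],
    "Sales":  [20000, 18000, 25000, 20000],
    "Profit": [5000, 4500, 7000, 5000],
})

# One table answering both questions at once: rows = Month, columns = Region
pivot = data.pivot_table(index="Month", columns="Region", values="Sales", aggfunc="sum")
print(pivot)
```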
Exercise on Vehicle Traffic data?
Step 1: Create a Dataset
Open Notepad and paste the following data:
Day,Intersection,Vehicles Counted,Average Speed
Monday,Intersection A,1200,45
Monday,Intersection B,800,50
Monday,Intersection C,1500,40
Tuesday,Intersection A,1300,42
Tuesday,Intersection B,900,48
Tuesday,Intersection C,1400,38
Wednesday,Intersection A,1250,43
Wednesday,Intersection B,850,46
Wednesday,Intersection C,1350,39
Thursday,Intersection A,1400,41
Thursday,Intersection B,920,47
Thursday,Intersection C,1450,37
Friday,Intersection A,1500,40
Friday,Intersection B,980,45
Friday,Intersection C,1600,35
Saturday,Intersection A,1700,38
Saturday,Intersection B,1100,42
Saturday,Intersection C,1800,33
Sunday,Intersection A,1600,39
Sunday,Intersection B,1050,44
Sunday,Intersection C,1750,34
Save the file as traffic_data.csv on your desktop.
Step 2: Upload the CSV File to Google Colab
Open Google Colab and create a new notebook.
from google.colab import files
# Upload the CSV file
uploaded = files.upload()
A file upload dialog will appear. Select the file traffic_data.csv from your desktop and upload it.
What to Expect
Once uploaded, Colab will display:
Uploading traffic_data.csv
traffic_data.csv(text/csv) - 717 bytes, last modified: 12/1/2024 - 100% done
Saving traffic_data.csv to traffic_data.csv
Step 3: Load and Display the Dataset
import pandas as pd # Importing Pandas for data handling
# Load the uploaded CSV file into a Pandas DataFrame
file_name = "traffic_data.csv"
data = pd.read_csv(file_name)
# Display the first few rows of the dataset
print("Dataset Overview:")
print(data.head())
Step 4: Total Vehicles Counted by Day
Let’s calculate the total number of vehicles counted for each day and create a bar chart.
import matplotlib.pyplot as plt
# Group data by Day and sum the Vehicles Counted
daywise_traffic = data.groupby("Day")["Vehicles Counted"].sum()
# Plot a bar chart
plt.figure(figsize=(8, 6))
daywise_traffic.plot(kind="bar", color="skyblue", edgecolor="black")
# Add title and labels
plt.title("Total Vehicles Counted by Day", fontsize=16)
plt.xlabel("Day", fontsize=12)
plt.ylabel("Total Vehicles", fontsize=12)
# Show the plot
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()
What This Code Does
Groups the data by Day and calculates the total number of vehicles for each day.
Creates a bar chart to visualize traffic patterns throughout the week.
What to Expect
A bar chart showing total vehicle counts for each day (Monday to Sunday)
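The dataset also carries an Average Speed column, which the steps above don’t use. A sketch computing mean speed per intersection, with a small slice of traffic_data.csv inlined so it runs on its own:

```python
import pandas as pd

# A slice of traffic_data.csv, inlined for a self-contained example
data = pd.DataFrame({
    "Day": ["Monday", "Monday", "Tuesday", "Tuesday"],
    "Intersection": ["Intersection A", "Intersection B", "Intersection A", "Intersection B"],
    "Vehicles Counted": [1200, 800, 1300, 900],
    "Average Speed": [45, 50, 42, 48],
})

# Mean speed per intersection: lower speeds often flag congested spots
speed = data.groupby("Intersection")["Average Speed"].mean()
print(speed)
```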
Step 5: Analyze Traffic for Odd and Even Days
We’ll classify the days into Odd and Even, calculate the total vehicle count for each category, and create a comparative bar chart.
# Classify days into Odd and Even based on their position in the week
odd_days = ["Monday", "Wednesday", "Friday", "Sunday"]
data["Day Type"] = data["Day"].apply(lambda x: "Odd" if x in odd_days else "Even")
# Group data by Day Type and sum the Vehicles Counted
odd_even_traffic = data.groupby("Day Type")["Vehicles Counted"].sum()
# Plot a bar chart for Odd and Even days
plt.figure(figsize=(8, 6))
odd_even_traffic.plot(kind="bar", color=["lightcoral", "skyblue"], edgecolor="black")
# Add title and labels
plt.title("Traffic Comparison: Odd vs Even Days", fontsize=16)
plt.xlabel("Day Type", fontsize=12)
plt.ylabel("Total Vehicles", fontsize=12)
# Show the plot
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()
What This Code Does
Classifies days into Odd (Monday, Wednesday, Friday, Sunday) and Even (Tuesday, Thursday, Saturday).
Groups data by these categories and calculates the total vehicle count for each.
Creates a bar chart to compare traffic between Odd and Even days.
What to Expect
A bar chart comparing total vehicle counts for Odd and Even days.
Highlights whether traffic is heavier on odd or even days, which is critical for policies like Odd-Even Rules.
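Note that real odd-even rules key on the calendar date’s parity, not a fixed list of weekdays. If the dataset had actual dates, the classification could be derived from them directly; a toy sketch with made-up dates:

```python
import pandas as pd

# Hypothetical dates; the course dataset only has weekday names
dates = pd.Series(pd.to_datetime(["2024-12-01", "2024-12-02", "2024-12-03"]))

# Classify each row by whether the day-of-month is odd or even
parity = dates.dt.day.map(lambda d: "Odd" if d % 2 else "Even")
print(parity.tolist())  # ['Odd', 'Even', 'Odd']
```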
odd_days = ["Monday", "Wednesday", "Friday", "Sunday"]
data["Day Type"] = data["Day"].apply(lambda x: "Odd" if x in odd_days else "Even")
odd_even_traffic = data.groupby("Day Type")["Vehicles Counted"].sum()
plt.figure(figsize=(8,6))
odd_even_traffic.plot(kind="bar", color=["lightcoral", "skyblue"], edgecolor="black")
plt.title("Traffic Comparison: Odd vs Even Days", fontsize=16)
plt.xlabel("Day Type", fontsize=12)
plt.ylabel("Total Vehicles", fontsize=12)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()
Exercise on Open Government Data?
We’ll use the Open Government Data (OGD) Platform India as our primary resource. Here's how you can proceed:
Step 1: Access the OGD Platform
Visit the OGD Platform: Open your web browser and go to data.gov.in.
Step 2: Search for a Dataset
Use the Search Function: On the homepage, you'll find a search bar. Type in keywords related to the data you're interested in, such as "rainfall," "population," or "traffic."
Filter Results: After searching, you can filter the results by file format (e.g., CSV) to find datasets that are easy to work with.
Step 3: Select and Download a Dataset
Choose a Dataset: Browse through the search results and select a dataset that fits your needs. For example, you might choose "State-wise Monthly Rainfall Data."
Download the Dataset: On the dataset's page, look for a download link or button, often labeled "Download" or "CSV." Click it to download the file to your computer.
Step 4: Upload the Dataset to Google Colab
Open Google Colab: Navigate to Google Colab and create a new notebook.
Upload the File:
In Colab, click on the folder icon on the left sidebar to open the file explorer.
Click the upload icon (a file with an upward arrow) and select the CSV file you downloaded.
Step 5: Load and Explore the Data
Import Necessary Libraries:
import pandas as pd
Load the Data:
# Replace 'filename.csv' with the actual name of your file
data = pd.read_csv('filename.csv')
View the Data:
# Display the first few rows of the dataset
data.head()
Step 6: Analyze and Visualize the Data
Check for Missing Values:
data.isnull().sum()
Basic Statistics:
data.describe()
Data Visualization:
import matplotlib.pyplot as plt
Plot Data (example for rainfall data):
# Ensure 'Date' column is in datetime format
data['Date'] = pd.to_datetime(data['Date'])
# Plot
plt.figure(figsize=(10, 5))
plt.plot(data['Date'], data['Rainfall'], marker='o')
plt.title('Monthly Rainfall Over Time')
plt.xlabel('Date')
plt.ylabel('Rainfall (mm)')
plt.grid(True)
plt.show()
Additional Resources
Smart Cities Mission Data Portal: For city-specific data, visit the Smart Cities Mission Data Portal.
NCRB (National Crime Records Bureau) Data
Let’s begin working on the NCRB_CII-2019_Table_17A.1.csv file you downloaded.
Step 1: Upload the CSV File to Google Colab
Open Google Colab and create a new notebook.
Add the following code in a new cell to upload the file:
from google.colab import files

# Upload the CSV file
uploaded = files.upload()
Run the cell by pressing Shift + Enter.
A file upload dialog will appear. Select the file NCRB_CII-2019_Table_17A.1.csv from your computer and upload it.
What to Expect
Once uploaded, Colab should display something like this:
Uploading NCRB_CII-2019_Table_17A.1.csv
NCRB_CII-2019_Table_17A.1.csv(text/csv) - 18087 bytes, last modified: 12/1/2024 - 100% done
Saving NCRB_CII-2019_Table_17A.1.csv to NCRB_CII-2019_Table_17A.1.csv
Step 2: Load and Display the Dataset
Add the following code to a new cell in Google Colab:
import pandas as pd
# Importing Pandas for data handling
# Load the uploaded CSV file into a Pandas DataFrame
file_name = "NCRB_CII-2019_Table_17A.1.csv"
data = pd.read_csv(file_name)
# Display the first few rows of the dataset
print("Dataset Overview:")
print(data.head())
Run the cell by pressing Shift + Enter.
What to Expect
You should see the first 5 rows of the dataset, including the column names and some data.
Step 4: Analyze the Dataset
Let’s start with a simple analysis: Identify the top 5 crime categories with the highest number of cases reported during the year.
# Sort the dataset by 'Cases Reported during the year' in descending order
top_crimes = data.sort_values(by='Cases Reported during the year', ascending=False)
# Select the top 5 rows
top_5_crimes = top_crimes[['Crime Head', 'Cases Reported during the year']].head(5)
# Display the results
print("Top 5 Crime Categories by Cases Reported:")
print(top_5_crimes)
What to Expect
A list of the top 5 crime categories with the highest number of cases reported during the year.
Top 5 Crime Categories by Cases Reported:
Crime Head Cases Reported during the year
141 Total Cognizable IPC crimes 3225701
60 Offences Affecting the Human Body (Total) 1050945
139 Miscellaneous IPC Crimes(Total) 860209
104 Offences against Property (Total) 854618
88 Theft 675916
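Notice that the top rows are aggregate totals ("Total Cognizable IPC crimes", "... (Total)"), which crowd out individual offences such as Theft. An optional refinement is filtering out rows whose Crime Head mentions "Total" before ranking; a sketch on a minimal stand-in for the table:

```python
import pandas as pd

# Minimal stand-in for the NCRB table: aggregate rows mixed with individual offences
data = pd.DataFrame({
    "Crime Head": ["Total Cognizable IPC crimes", "Offences against Property (Total)",
                   "Theft", "Riots"],
    "Cases Reported during the year": [3225701, 854618, 675916, 50000],
})

# Drop aggregate rows so totals don't crowd out individual crime categories
individual = data[~data["Crime Head"].str.contains("Total", case=False)]
top = individual.sort_values("Cases Reported during the year", ascending=False).head(5)
print(top)
```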
Step 5: Visualize the Top 5 Crime Categories by Cases Reported
import matplotlib.pyplot as plt
# Extract the crime categories and case numbers
crime_categories = top_5_crimes['Crime Head']
case_numbers = top_5_crimes['Cases Reported during the year']
# Create the bar chart
plt.figure(figsize=(10, 6))
plt.barh(crime_categories, case_numbers, color='skyblue', edgecolor='black')
# Add title and labels
plt.title("Top 5 Crime Categories by Cases Reported", fontsize=16)
plt.xlabel("Number of Cases", fontsize=12)
plt.ylabel("Crime Categories", fontsize=12)
# Show the plot
plt.tight_layout()
plt.show()
What to Expect
A horizontal bar chart showcasing the top 5 crime categories with their respective number of cases reported during the year.
import matplotlib.pyplot as plt
crime_categories = top_5_crimes['Crime Head']
case_numbers = top_5_crimes['Cases Reported during the year']
plt.figure(figsize=(10, 6))
plt.barh(crime_categories, case_numbers, color='skyblue', edgecolor='black')
plt.title("Top 5 Crime Categories by Cases Reported", fontsize=16)
plt.xlabel("Number of Cases", fontsize=12)
plt.ylabel("Crime Categories", fontsize=12)
plt.tight_layout()
plt.show()
A simple plot for a chef?
import matplotlib.pyplot as plt
# Your dishes and their quantities
dishes = ["Pizza", "Burger", "Pasta"]
quantities = [5, 3, 7]
# Create a horizontal bar chart
plt.barh(dishes, quantities, color='skyblue')
# Add a title and labels
plt.title("Dish Quantities at Buffet")
plt.xlabel("Quantity (kg)")
plt.ylabel("Dishes")
# Show the chart
plt.show()
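One small readability tweak: sorting the bars so the largest dish ends up on top of the chart. A sketch (the Agg backend and savefig are only there so it also runs outside Colab):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so this also runs without a display
import matplotlib.pyplot as plt

dishes = ["Pizza", "Burger", "Pasta"]
quantities = [5, 3, 7]

# Sort the (quantity, dish) pairs so barh draws the largest bar last, i.e. on top
pairs = sorted(zip(quantities, dishes))
qty_sorted, dishes_sorted = zip(*pairs)

plt.barh(dishes_sorted, qty_sorted, color="skyblue")
plt.title("Dish Quantities at Buffet (sorted)")
plt.xlabel("Quantity (kg)")
plt.savefig("buffet_sorted.png")
```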
CRPF csv file & Predictions
Copy and paste the following into Notepad and save it as a .csv file:
CRPF_Incidents_Large.csv
Incident_ID,Location,Date,Incident_Type,Casualties,Reported_By,Severity
1,Assam,2024-08-23,Encounter,0,Patrol Unit B,Low
2,Odisha,2024-04-10,Riots,0,Unit D,Low
3,Odisha,2024-06-10,Encounter,2,Unit D,Medium
4,West Bengal,2024-07-27,Patrolling,6,Unit C,Medium
5,Jammu & Kashmir,2024-07-22,Patrolling,0,Patrol Unit A,Medium
6,Maharashtra,2024-08-08,IED Blast,4,Unit C,Low
7,Bihar,2024-10-30,Encounter,7,Local Intel,Low
8,Assam,2024-07-19,Search Operation,9,Unit D,Low
9,Assam,2024-07-14,Riots,5,Local Intel,Medium
10,Kerala,2024-12-06,Encounter,8,Unit D,Medium
11,Odisha,2025-01-13,Search Operation,1,Patrol Unit A,Medium
12,Bihar,2024-10-08,Search Operation,8,Patrol Unit B,High
13,Assam,2024-06-01,Encounter,5,Local Intel,High
14,Odisha,2024-11-26,Encounter,8,Unit C,Low
15,Bihar,2024-07-24,Search Operation,6,Unit D,High
16,West Bengal,2024-11-29,Search Operation,2,Patrol Unit A,Low
17,Kerala,2024-10-22,Riots,4,Unit C,Medium
18,Jammu & Kashmir,2024-03-04,IED Blast,3,Patrol Unit B,High
19,Assam,2024-09-06,Patrolling,4,Unit C,Low
20,Chhattisgarh,2024-07-29,Encounter,7,Local Intel,High
21,Assam,2024-05-01,Riots,7,Unit C,Low
22,West Bengal,2024-06-08,Search Operation,10,Patrol Unit A,Medium
23,Jammu & Kashmir,2024-05-14,Ambush,2,Local Intel,High
24,Jammu & Kashmir,2024-07-02,IED Blast,2,Unit C,Low
25,Odisha,2024-11-04,Ambush,8,Local Intel,Low
26,Bihar,2024-09-28,IED Blast,7,Patrol Unit A,High
27,Chhattisgarh,2024-06-12,IED Blast,4,Local Intel,Low
28,Maharashtra,2024-04-26,Patrolling,2,Unit C,Medium
29,Jammu & Kashmir,2024-08-13,Encounter,10,Unit C,Low
30,Maharashtra,2024-09-16,Encounter,10,Patrol Unit A,High
31,Odisha,2024-02-09,Search Operation,2,Unit D,Low
32,Maharashtra,2024-02-11,Encounter,8,Unit D,High
33,Bihar,2025-01-02,Encounter,10,Unit D,Low
34,Odisha,2024-08-27,Search Operation,4,Unit D,Medium
35,Bihar,2024-07-26,Search Operation,7,Patrol Unit B,Low
36,Kerala,2024-09-29,Search Operation,0,Local Intel,High
37,Jammu & Kashmir,2024-09-14,Encounter,4,Local Intel,High
38,Maharashtra,2024-10-09,Patrolling,10,Unit C,Medium
39,Maharashtra,2024-10-17,Patrolling,9,Local Intel,Low
40,Kerala,2024-02-09,Patrolling,3,Unit D,Medium
41,Maharashtra,2024-10-06,Ambush,0,Patrol Unit A,Low
42,Jammu & Kashmir,2024-02-29,Search Operation,8,Unit D,Medium
43,Odisha,2024-11-07,IED Blast,1,Unit D,Medium
44,West Bengal,2024-02-03,Riots,5,Local Intel,Medium
45,Maharashtra,2024-11-26,Ambush,2,Local Intel,Low
46,Kerala,2025-01-14,Encounter,6,Patrol Unit B,Medium
47,Assam,2024-05-20,Riots,2,Patrol Unit A,Low
48,Bihar,2024-04-14,Encounter,7,Local Intel,High
49,Bihar,2024-05-22,Encounter,10,Patrol Unit A,Medium
50,West Bengal,2024-07-07,Encounter,9,Unit C,Low
51,Chhattisgarh,2024-07-29,Search Operation,1,Unit D,Medium
52,Bihar,2024-11-18,Riots,6,Patrol Unit B,Medium
53,Assam,2024-09-09,Encounter,4,Patrol Unit B,Medium
54,West Bengal,2025-01-02,Search Operation,8,Patrol Unit A,Medium
55,Chhattisgarh,2024-03-04,Search Operation,9,Patrol Unit B,Medium
56,Jammu & Kashmir,2024-06-25,Ambush,0,Patrol Unit B,Medium
57,Maharashtra,2024-05-22,Riots,2,Local Intel,Medium
58,Assam,2024-08-17,Riots,5,Patrol Unit B,High
59,Assam,2024-11-02,Search Operation,4,Unit C,Medium
60,Assam,2024-04-16,Encounter,6,Patrol Unit B,Medium
61,Maharashtra,2024-10-09,Patrolling,6,Patrol Unit B,Low
62,Kerala,2024-07-06,IED Blast,7,Unit D,Low
63,Maharashtra,2024-06-29,IED Blast,1,Unit D,High
64,Chhattisgarh,2024-08-09,Patrolling,7,Patrol Unit A,Low
65,Bihar,2024-06-17,Ambush,5,Patrol Unit A,Low
66,Chhattisgarh,2024-07-05,Patrolling,4,Unit C,Low
67,West Bengal,2024-09-25,Patrolling,9,Unit C,Medium
68,Odisha,2024-09-26,Encounter,8,Local Intel,Medium
69,Assam,2024-03-29,Encounter,7,Unit D,Medium
70,Odisha,2024-12-05,IED Blast,3,Patrol Unit B,High
71,West Bengal,2024-05-29,IED Blast,0,Local Intel,Low
72,Odisha,2024-02-18,IED Blast,10,Unit C,Low
73,Jammu & Kashmir,2025-01-25,Riots,7,Local Intel,High
74,Maharashtra,2024-03-03,Patrolling,2,Patrol Unit A,Low
75,Assam,2024-02-27,Patrolling,5,Unit D,Low
76,Jammu & Kashmir,2024-10-28,Search Operation,5,Patrol Unit B,Medium
77,Maharashtra,2025-01-26,Search Operation,2,Patrol Unit A,High
78,Assam,2024-08-31,Encounter,6,Patrol Unit B,Low
79,Chhattisgarh,2024-03-02,IED Blast,10,Unit C,High
80,Chhattisgarh,2024-05-09,Riots,7,Patrol Unit A,High
81,Bihar,2024-10-16,Encounter,7,Patrol Unit B,Medium
82,Bihar,2024-10-12,Ambush,8,Patrol Unit A,Low
83,Chhattisgarh,2024-12-05,Search Operation,0,Unit D,Low
84,Bihar,2024-05-06,Riots,2,Unit D,Low
85,Bihar,2024-05-12,Riots,6,Unit C,Low
86,Chhattisgarh,2024-03-01,Search Operation,3,Patrol Unit B,High
87,West Bengal,2024-11-05,IED Blast,1,Patrol Unit A,Medium
88,Jammu & Kashmir,2024-01-29,Riots,10,Patrol Unit B,High
89,Assam,2024-05-13,Patrolling,10,Patrol Unit A,Low
90,Assam,2024-10-08,Ambush,0,Unit C,Low
91,Jammu & Kashmir,2024-06-22,IED Blast,10,Unit C,High
92,Assam,2025-01-08,Patrolling,8,Patrol Unit B,High
93,Assam,2025-01-22,IED Blast,2,Local Intel,High
94,Maharashtra,2024-10-06,Patrolling,7,Local Intel,High
95,Jammu & Kashmir,2024-04-11,IED Blast,4,Local Intel,Medium
96,Bihar,2024-09-16,Ambush,8,Unit D,Low
97,Bihar,2024-10-20,Patrolling,2,Unit C,Low
98,West Bengal,2024-01-30,Encounter,9,Patrol Unit B,Low
99,Chhattisgarh,2024-07-26,Riots,4,Unit C,High
100,Kerala,2024-03-12,Ambush,6,Local Intel,Low
101,West Bengal,2024-09-20,IED Blast,10,Local Intel,Low
102,Bihar,2024-02-21,Riots,2,Patrol Unit A,Low
103,Assam,2025-01-13,Riots,0,Unit D,Medium
104,Chhattisgarh,2025-01-13,Patrolling,3,Unit D,Low
105,Odisha,2024-11-28,IED Blast,2,Unit C,Low
106,Jammu & Kashmir,2024-10-17,Ambush,8,Patrol Unit A,Medium
107,Bihar,2024-05-16,IED Blast,9,Patrol Unit A,Low
108,West Bengal,2024-03-12,Encounter,6,Local Intel,High
109,Odisha,2024-05-28,Encounter,4,Unit C,Medium
110,Jammu & Kashmir,2024-09-24,Encounter,1,Local Intel,Low
111,West Bengal,2024-03-30,Search Operation,4,Unit C,Low
112,West Bengal,2024-03-18,Patrolling,8,Unit D,Low
113,West Bengal,2024-03-01,Riots,8,Patrol Unit A,Medium
114,Jammu & Kashmir,2024-08-05,Encounter,6,Patrol Unit B,High
115,Maharashtra,2024-11-02,Search Operation,7,Patrol Unit B,Medium
116,West Bengal,2024-08-21,Encounter,7,Patrol Unit B,High
117,Odisha,2024-02-29,IED Blast,9,Unit C,Medium
118,Odisha,2024-10-22,Search Operation,5,Patrol Unit B,Medium
119,Jammu & Kashmir,2024-01-30,Patrolling,4,Unit C,High
120,Odisha,2024-02-13,Patrolling,5,Local Intel,Low
121,West Bengal,2024-03-14,IED Blast,9,Patrol Unit A,Medium
122,Kerala,2024-05-15,Search Operation,0,Local Intel,Medium
123,Chhattisgarh,2024-02-14,Riots,6,Patrol Unit A,Medium
124,Odisha,2025-01-15,Riots,1,Local Intel,Low
125,Chhattisgarh,2024-10-04,Patrolling,0,Patrol Unit B,High
126,Bihar,2025-01-18,Search Operation,8,Unit D,High
127,Bihar,2025-01-11,IED Blast,3,Unit D,Low
128,Assam,2024-04-25,Patrolling,8,Local Intel,High
129,Kerala,2024-11-21,Encounter,3,Patrol Unit A,Medium
130,Bihar,2024-04-01,Patrolling,7,Unit D,Medium
131,Assam,2024-11-16,Ambush,1,Unit D,High
132,West Bengal,2024-10-13,IED Blast,1,Local Intel,Low
133,Odisha,2024-03-07,Patrolling,2,Patrol Unit A,Low
134,West Bengal,2024-03-09,Riots,10,Unit C,Medium
135,Bihar,2024-10-18,Riots,2,Unit D,Low
136,Assam,2024-08-14,Riots,3,Local Intel,High
137,Jammu & Kashmir,2024-07-30,Ambush,6,Unit D,Low
138,Kerala,2024-11-04,Search Operation,1,Local Intel,High
139,Odisha,2024-08-26,Riots,1,Patrol Unit A,High
140,Odisha,2024-02-13,IED Blast,9,Patrol Unit A,High
141,West Bengal,2024-05-24,Search Operation,1,Patrol Unit A,Low
142,West Bengal,2024-07-31,Encounter,1,Patrol Unit B,Medium
143,West Bengal,2024-12-08,Ambush,3,Unit C,High
144,West Bengal,2024-06-29,Search Operation,4,Local Intel,High
145,Bihar,2024-02-01,Patrolling,4,Unit C,Medium
146,Odisha,2024-07-04,Search Operation,1,Patrol Unit B,High
147,Maharashtra,2024-08-31,IED Blast,8,Local Intel,High
148,Assam,2025-01-01,Search Operation,3,Local Intel,Medium
149,Maharashtra,2024-04-12,Riots,1,Patrol Unit A,Medium
150,Odisha,2024-03-04,Riots,5,Patrol Unit B,High
151,Kerala,2024-08-22,Patrolling,10,Patrol Unit B,Medium
152,Kerala,2024-06-25,IED Blast,10,Unit C,Low
153,Bihar,2024-04-05,Search Operation,4,Patrol Unit B,Low
154,Bihar,2025-01-08,Search Operation,10,Unit D,Medium
155,Bihar,2024-05-25,Riots,3,Unit C,Medium
156,Assam,2024-07-22,Encounter,2,Unit C,High
157,West Bengal,2024-02-24,Search Operation,1,Local Intel,Medium
158,Maharashtra,2024-08-23,Ambush,10,Local Intel,Medium
159,Jammu & Kashmir,2024-11-05,Ambush,1,Patrol Unit B,Medium
160,Assam,2024-05-02,Patrolling,3,Local Intel,High
161,Chhattisgarh,2024-10-06,Search Operation,8,Patrol Unit A,Low
162,Maharashtra,2024-05-28,Search Operation,5,Local Intel,Low
163,Jammu & Kashmir,2024-11-07,Search Operation,8,Patrol Unit B,Low
164,Odisha,2024-11-29,Patrolling,9,Local Intel,Medium
165,West Bengal,2024-04-20,Patrolling,3,Patrol Unit A,High
166,Chhattisgarh,2025-01-13,Search Operation,8,Local Intel,Low
167,West Bengal,2024-09-23,Riots,10,Patrol Unit B,Medium
168,Kerala,2024-12-31,Search Operation,4,Patrol Unit A,Medium
169,Kerala,2024-06-04,Patrolling,6,Patrol Unit A,Medium
170,Odisha,2024-09-23,Encounter,7,Patrol Unit B,Low
171,Kerala,2024-02-05,Encounter,0,Local Intel,High
172,Jammu & Kashmir,2024-04-12,Encounter,1,Local Intel,Medium
173,Chhattisgarh,2024-08-26,Encounter,0,Patrol Unit A,High
174,Assam,2024-02-09,IED Blast,6,Patrol Unit A,Medium
175,Jammu & Kashmir,2024-09-06,Riots,6,Patrol Unit B,High
176,West Bengal,2024-02-15,Search Operation,5,Unit C,High
177,Odisha,2024-04-10,Riots,0,Unit C,Medium
178,Maharashtra,2024-10-12,Patrolling,9,Local Intel,Low
179,Maharashtra,2024-09-09,Riots,4,Unit D,Medium
180,Kerala,2024-12-04,Riots,5,Patrol Unit A,Medium
181,Bihar,2024-04-09,IED Blast,6,Unit C,Medium
182,Odisha,2024-10-09,Riots,2,Patrol Unit B,Medium
183,Chhattisgarh,2024-03-03,Search Operation,3,Unit D,High
184,Jammu & Kashmir,2024-01-27,Riots,10,Patrol Unit A,Low
185,West Bengal,2024-02-29,Patrolling,1,Unit D,High
186,Chhattisgarh,2024-04-11,Patrolling,1,Local Intel,Low
187,Assam,2024-10-17,Patrolling,1,Local Intel,High
188,Assam,2025-01-20,Search Operation,3,Patrol Unit B,Low
189,Jammu & Kashmir,2024-11-11,IED Blast,0,Local Intel,Medium
190,Jammu & Kashmir,2024-07-26,Patrolling,9,Patrol Unit B,High
191,Assam,2024-04-20,IED Blast,1,Unit C,Low
192,Assam,2024-02-05,Patrolling,8,Unit C,Medium
193,Maharashtra,2024-09-13,IED Blast,1,Patrol Unit B,Medium
194,Kerala,2024-12-03,Search Operation,10,Local Intel,High
195,Maharashtra,2024-01-30,Patrolling,9,Patrol Unit A,Low
196,Kerala,2024-02-12,Patrolling,4,Unit C,Medium
197,Maharashtra,2025-01-18,Patrolling,9,Patrol Unit B,High
198,West Bengal,2024-03-21,Search Operation,10,Local Intel,Low
199,Odisha,2024-10-19,Encounter,5,Patrol Unit A,High
200,Kerala,2024-05-11,Riots,8,Patrol Unit B,Low
201,Jammu & Kashmir,2025-01-14,Riots,0,Unit D,Low
202,Kerala,2024-06-06,Patrolling,5,Unit C,High
203,Bihar,2024-07-11,Ambush,9,Unit D,Medium
204,Maharashtra,2024-12-22,Search Operation,8,Unit D,Low
205,Jammu & Kashmir,2024-11-04,Ambush,0,Local Intel,Medium
206,Jammu & Kashmir,2024-11-04,Patrolling,9,Patrol Unit B,Medium
207,Bihar,2024-09-28,IED Blast,10,Local Intel,High
208,Maharashtra,2024-05-02,IED Blast,10,Unit D,Low
209,Chhattisgarh,2024-08-08,Patrolling,2,Unit C,Medium
210,Chhattisgarh,2024-12-03,Encounter,4,Unit C,High
211,Kerala,2024-05-21,Patrolling,0,Local Intel,Medium
212,Kerala,2025-01-21,Patrolling,1,Local Intel,Low
213,Odisha,2024-08-19,Search Operation,2,Local Intel,High
214,Chhattisgarh,2024-02-04,Riots,10,Unit D,Medium
215,West Bengal,2024-06-05,IED Blast,8,Patrol Unit A,Low
216,Jammu & Kashmir,2024-05-02,Riots,0,Unit C,High
217,Jammu & Kashmir,2024-08-26,Search Operation,6,Unit D,Medium
218,Odisha,2024-05-26,Encounter,0,Unit D,High
219,Bihar,2024-10-30,Riots,1,Unit D,Medium
220,Odisha,2024-10-13,Search Operation,6,Unit C,High
221,Odisha,2024-04-16,Encounter,10,Patrol Unit A,Low
222,Chhattisgarh,2024-04-18,Patrolling,4,Patrol Unit B,High
223,West Bengal,2024-09-05,Patrolling,6,Local Intel,High
224,Maharashtra,2024-09-14,Search Operation,9,Patrol Unit B,Medium
225,Maharashtra,2024-06-24,IED Blast,7,Unit C,Low
226,Chhattisgarh,2024-06-03,Search Operation,7,Patrol Unit A,High
227,Bihar,2024-04-19,Ambush,10,Patrol Unit B,Low
228,Assam,2024-09-09,Encounter,3,Patrol Unit A,Medium
229,Maharashtra,2024-08-06,Riots,4,Unit C,Low
230,Kerala,2024-05-17,Encounter,5,Local Intel,Medium
231,Chhattisgarh,2024-12-01,Ambush,3,Local Intel,Medium
232,Maharashtra,2024-02-25,Patrolling,7,Unit D,Low
233,Odisha,2024-07-02,Ambush,5,Unit D,Medium
234,Assam,2024-04-09,Search Operation,10,Unit C,Low
235,West Bengal,2024-09-05,IED Blast,0,Patrol Unit B,High
236,Assam,2025-01-26,Encounter,6,Patrol Unit A,High
237,Odisha,2024-11-15,Riots,1,Patrol Unit B,Medium
238,Chhattisgarh,2024-02-15,Search Operation,0,Patrol Unit B,High
239,Odisha,2024-04-08,Encounter,4,Unit C,Low
240,Jammu & Kashmir,2024-08-26,Patrolling,6,Unit C,Medium
241,Bihar,2024-05-05,Ambush,3,Unit C,Low
242,Jammu & Kashmir,2025-01-04,Riots,10,Patrol Unit A,Medium
243,Maharashtra,2024-02-27,IED Blast,6,Patrol Unit B,High
244,Assam,2024-08-01,Search Operation,3,Unit D,Medium
245,Assam,2024-09-05,Patrolling,8,Unit D,Medium
246,Maharashtra,2024-07-16,Search Operation,3,Unit D,Medium
247,Kerala,2024-06-07,Encounter,10,Local Intel,Low
248,Chhattisgarh,2024-05-09,Patrolling,2,Local Intel,Low
249,Chhattisgarh,2024-03-22,Riots,7,Patrol Unit B,Medium
250,Kerala,2024-11-04,Encounter,9,Patrol Unit A,Low
251,Jammu & Kashmir,2024-10-23,Riots,10,Patrol Unit A,Medium
252,Kerala,2024-06-18,Ambush,7,Unit D,Low
253,Kerala,2024-03-05,Riots,2,Patrol Unit B,Medium
254,Jammu & Kashmir,2024-12-13,Ambush,4,Unit C,High
255,Kerala,2024-04-29,Search Operation,4,Unit C,High
256,Odisha,2024-02-27,Riots,6,Unit C,High
257,Odisha,2024-05-14,IED Blast,10,Unit D,Low
258,Kerala,2024-03-07,Search Operation,3,Patrol Unit A,Low
259,Chhattisgarh,2024-09-04,IED Blast,8,Unit C,Medium
260,Odisha,2024-04-01,IED Blast,10,Unit C,Medium
261,Assam,2024-02-26,Patrolling,1,Unit D,High
262,Kerala,2024-02-27,Patrolling,2,Patrol Unit B,Medium
263,West Bengal,2024-06-25,Patrolling,10,Local Intel,Medium
264,Jammu & Kashmir,2024-12-15,Ambush,1,Unit D,High
265,Kerala,2024-09-14,Search Operation,4,Patrol Unit A,Medium
266,Assam,2024-01-29,Riots,8,Patrol Unit A,Medium
267,Assam,2025-01-22,IED Blast,5,Unit C,Low
268,Kerala,2024-07-10,Patrolling,3,Patrol Unit B,Low
269,Assam,2024-06-11,Search Operation,0,Unit C,High
270,Assam,2024-10-20,IED Blast,1,Unit C,Medium
271,Kerala,2024-09-17,Ambush,4,Unit C,Medium
272,West Bengal,2024-04-09,Encounter,5,Patrol Unit B,Low
273,Odisha,2024-12-29,Encounter,3,Patrol Unit B,High
274,Kerala,2024-12-08,Ambush,7,Unit C,Medium
275,Maharashtra,2025-01-16,Ambush,7,Unit D,Medium
276,Maharashtra,2024-07-21,Ambush,4,Local Intel,Medium
277,Maharashtra,2024-08-24,Patrolling,4,Unit D,Medium
278,Chhattisgarh,2024-05-17,Encounter,7,Unit D,High
279,West Bengal,2024-09-05,IED Blast,9,Unit C,High
280,Kerala,2024-12-16,Search Operation,8,Patrol Unit B,High
281,Odisha,2024-05-24,Patrolling,9,Patrol Unit B,Low
282,Jammu & Kashmir,2024-08-06,IED Blast,1,Local Intel,High
283,Chhattisgarh,2024-03-14,Riots,4,Patrol Unit B,Medium
284,Chhattisgarh,2024-06-30,Ambush,1,Unit D,Medium
285,Maharashtra,2024-09-06,Encounter,7,Unit D,Low
286,Chhattisgarh,2024-10-26,Patrolling,0,Patrol Unit A,High
287,Maharashtra,2024-04-21,Search Operation,5,Unit D,Medium
288,Kerala,2024-11-07,Patrolling,4,Unit C,High
289,Kerala,2024-09-20,Ambush,1,Local Intel,High
290,West Bengal,2024-11-15,Patrolling,2,Unit D,Medium
291,Odisha,2024-12-14,Riots,5,Unit C,Low
292,Bihar,2024-09-07,IED Blast,10,Unit C,Medium
293,Kerala,2024-12-21,Riots,2,Patrol Unit B,Low
294,Maharashtra,2024-08-28,Search Operation,3,Unit D,Low
295,Odisha,2025-01-18,Patrolling,3,Local Intel,Medium
296,Maharashtra,2024-10-05,Patrolling,1,Local Intel,High
297,Chhattisgarh,2024-06-01,Patrolling,0,Patrol Unit A,Medium
298,West Bengal,2024-08-01,Patrolling,4,Patrol Unit B,High
299,Bihar,2025-01-11,Encounter,4,Patrol Unit B,Medium
300,Odisha,2024-04-24,Patrolling,10,Unit C,Low
301,Chhattisgarh,2024-09-14,Ambush,10,Unit D,High
302,Chhattisgarh,2024-10-06,Search Operation,7,Unit D,Medium
303,Chhattisgarh,2024-08-26,IED Blast,10,Patrol Unit A,High
304,Maharashtra,2024-09-05,Patrolling,10,Unit C,Medium
305,West Bengal,2025-01-15,Patrolling,10,Unit C,Medium
306,Jammu & Kashmir,2024-06-14,Ambush,10,Unit C,Low
307,Bihar,2024-09-07,Patrolling,6,Unit C,Low
308,West Bengal,2024-02-19,Ambush,5,Patrol Unit A,High
309,Odisha,2024-12-01,Encounter,0,Unit C,Low
310,Jammu & Kashmir,2024-03-29,Riots,8,Unit C,Low
311,West Bengal,2024-09-08,Patrolling,1,Patrol Unit B,Low
312,Kerala,2024-05-24,Search Operation,5,Local Intel,High
313,Chhattisgarh,2024-05-04,IED Blast,7,Unit C,Low
314,Odisha,2024-04-13,Search Operation,6,Local Intel,Low
315,Chhattisgarh,2024-03-19,Encounter,3,Local Intel,Medium
316,Maharashtra,2025-01-08,Ambush,9,Local Intel,Low
317,Kerala,2024-08-01,Search Operation,10,Unit C,Low
318,West Bengal,2024-07-24,Patrolling,10,Patrol Unit A,Low
319,Kerala,2024-08-31,IED Blast,3,Unit D,Low
320,Kerala,2024-02-27,Ambush,3,Unit C,High
321,West Bengal,2024-12-27,Ambush,9,Local Intel,Medium
322,Assam,2024-02-29,Ambush,1,Local Intel,Low
323,Kerala,2024-12-31,Ambush,2,Unit D,Low
324,Kerala,2024-03-11,IED Blast,8,Unit C,Medium
325,Assam,2024-12-21,Encounter,4,Patrol Unit A,High
326,Odisha,2024-04-19,Patrolling,10,Local Intel,Low
327,Kerala,2024-05-03,Encounter,4,Patrol Unit A,Medium
328,Assam,2024-04-20,Patrolling,2,Unit D,Medium
329,Chhattisgarh,2024-10-03,Encounter,8,Unit D,High
330,Chhattisgarh,2025-01-02,IED Blast,7,Unit C,Low
331,Chhattisgarh,2024-08-19,IED Blast,5,Patrol Unit A,High
332,Bihar,2024-08-22,Encounter,10,Patrol Unit B,Low
333,Bihar,2024-03-25,Riots,6,Patrol Unit A,Low
334,Kerala,2024-01-30,Riots,7,Unit D,High
335,Maharashtra,2024-08-15,IED Blast,8,Patrol Unit A,High
336,Jammu & Kashmir,2024-04-27,Patrolling,0,Local Intel,Low
337,Bihar,2024-10-01,Search Operation,8,Unit D,Low
338,Odisha,2024-05-22,IED Blast,1,Unit D,High
339,Jammu & Kashmir,2024-03-11,Ambush,8,Patrol Unit A,High
340,Maharashtra,2024-08-22,Riots,8,Unit C,Low
341,Assam,2024-08-04,Riots,3,Patrol Unit B,High
342,Odisha,2024-11-03,Ambush,10,Local Intel,Medium
343,West Bengal,2024-02-26,Search Operation,10,Patrol Unit A,High
344,Jammu & Kashmir,2024-11-16,Search Operation,4,Patrol Unit A,Low
345,Odisha,2024-09-23,Ambush,5,Local Intel,Medium
346,Kerala,2024-07-31,IED Blast,7,Unit C,Medium
347,Odisha,2024-04-05,IED Blast,10,Patrol Unit B,Low
348,Assam,2024-10-29,IED Blast,3,Patrol Unit A,Low
349,Jammu & Kashmir,2024-09-18,IED Blast,3,Patrol Unit B,Medium
350,Jammu & Kashmir,2024-05-06,Search Operation,6,Local Intel,Medium
351,Jammu & Kashmir,2024-12-31,Patrolling,10,Local Intel,High
352,Chhattisgarh,2025-01-23,Riots,0,Patrol Unit A,High
353,Kerala,2024-06-06,Ambush,10,Patrol Unit A,High
354,Maharashtra,2024-09-13,IED Blast,5,Unit D,High
355,Kerala,2024-10-07,Search Operation,4,Local Intel,Medium
356,Jammu & Kashmir,2024-08-04,Search Operation,3,Unit C,High
357,Maharashtra,2024-08-13,Patrolling,9,Patrol Unit A,Medium
358,Maharashtra,2024-04-01,Patrolling,1,Patrol Unit A,High
359,Assam,2025-01-06,Riots,9,Unit C,High
360,Bihar,2025-01-05,Search Operation,9,Patrol Unit A,High
361,Bihar,2024-09-11,Search Operation,2,Unit D,High
362,Maharashtra,2024-06-15,Ambush,2,Patrol Unit B,High
363,West Bengal,2024-07-19,IED Blast,1,Unit D,Low
364,Chhattisgarh,2024-04-24,Riots,9,Patrol Unit A,Medium
365,West Bengal,2024-02-01,Riots,1,Local Intel,Low
366,Kerala,2024-11-24,Riots,0,Unit C,Low
367,Kerala,2024-10-15,Patrolling,2,Patrol Unit B,Low
368,Assam,2024-12-27,Riots,6,Unit C,High
369,Odisha,2024-07-11,Ambush,8,Unit D,Medium
370,West Bengal,2024-09-30,Encounter,6,Local Intel,High
371,Maharashtra,2024-11-29,Ambush,2,Unit D,Low
372,Maharashtra,2024-07-29,Riots,8,Unit C,High
373,Odisha,2024-07-09,Patrolling,5,Patrol Unit B,Low
374,Bihar,2024-03-19,Search Operation,1,Patrol Unit B,High
375,Kerala,2024-02-14,Search Operation,4,Local Intel,High
376,Maharashtra,2024-07-21,Patrolling,9,Unit C,High
377,Assam,2024-02-04,Search Operation,7,Local Intel,Low
378,Maharashtra,2024-06-07,IED Blast,2,Unit C,Medium
379,Chhattisgarh,2024-02-20,Ambush,2,Local Intel,High
380,Assam,2024-11-13,Riots,9,Local Intel,Low
381,West Bengal,2024-09-17,Ambush,2,Unit D,High
382,West Bengal,2024-03-04,IED Blast,9,Unit C,Low
383,Kerala,2024-09-16,Ambush,3,Patrol Unit B,Medium
384,Assam,2024-11-11,Search Operation,1,Patrol Unit A,Medium
385,Odisha,2024-10-13,Patrolling,9,Local Intel,High
386,Chhattisgarh,2025-01-21,Search Operation,2,Unit D,High
387,Jammu & Kashmir,2024-03-27,Search Operation,1,Patrol Unit A,High
388,Bihar,2024-04-28,Ambush,3,Patrol Unit B,High
389,Jammu & Kashmir,2024-09-28,Ambush,1,Local Intel,High
390,Chhattisgarh,2024-07-11,Ambush,2,Patrol Unit B,High
391,Odisha,2024-05-08,Search Operation,7,Patrol Unit B,Low
392,Kerala,2024-09-21,Riots,0,Patrol Unit B,High
393,Jammu & Kashmir,2024-11-03,Encounter,0,Unit C,Medium
394,Chhattisgarh,2024-05-09,Encounter,6,Patrol Unit B,High
395,Odisha,2024-04-07,Patrolling,8,Patrol Unit B,Low
396,Bihar,2024-04-17,Ambush,6,Unit C,Low
397,Bihar,2024-05-18,IED Blast,10,Patrol Unit A,Medium
398,Maharashtra,2024-08-08,Encounter,5,Patrol Unit A,Low
399,Bihar,2024-03-13,Encounter,2,Unit D,High
400,Odisha,2024-03-10,Encounter,8,Local Intel,High
401,Jammu & Kashmir,2024-07-22,Encounter,4,Unit D,Low
402,Jammu & Kashmir,2024-05-17,Patrolling,7,Local Intel,Medium
403,Bihar,2024-06-04,IED Blast,0,Patrol Unit B,High
404,Odisha,2024-08-01,Search Operation,2,Local Intel,High
405,Jammu & Kashmir,2025-01-09,Encounter,10,Patrol Unit B,Low
406,Kerala,2024-04-17,Search Operation,10,Unit D,High
407,Jammu & Kashmir,2024-12-13,Encounter,2,Patrol Unit A,Medium
408,Maharashtra,2024-07-16,Encounter,2,Unit D,Medium
409,Odisha,2025-01-03,Riots,1,Unit D,High
410,Maharashtra,2024-12-23,IED Blast,3,Patrol Unit A,High
411,Jammu & Kashmir,2024-02-05,Search Operation,1,Unit D,Medium
412,Maharashtra,2025-01-22,Encounter,9,Patrol Unit B,High
413,Jammu & Kashmir,2024-10-02,IED Blast,4,Patrol Unit A,Medium
414,Jammu & Kashmir,2024-04-24,IED Blast,6,Patrol Unit A,Low
415,Odisha,2024-08-06,IED Blast,9,Local Intel,Medium
416,Chhattisgarh,2024-02-19,IED Blast,2,Unit D,Low
417,Bihar,2024-07-09,Search Operation,3,Local Intel,Low
418,Jammu & Kashmir,2024-02-14,IED Blast,2,Patrol Unit B,Low
419,Assam,2024-12-22,Ambush,8,Patrol Unit B,Low
420,Odisha,2024-04-12,Riots,6,Patrol Unit B,High
421,Jammu & Kashmir,2024-03-11,Riots,10,Local Intel,High
422,Odisha,2024-02-09,Riots,10,Unit C,Medium
423,Jammu & Kashmir,2024-08-28,Search Operation,7,Patrol Unit B,Low
424,Bihar,2024-02-12,Encounter,1,Local Intel,Medium
425,Assam,2024-08-14,IED Blast,8,Local Intel,High
426,Odisha,2024-10-13,IED Blast,1,Patrol Unit B,High
427,West Bengal,2025-01-18,Patrolling,8,Patrol Unit B,Medium
428,West Bengal,2024-04-20,Encounter,2,Patrol Unit A,High
429,Kerala,2024-06-21,Ambush,5,Local Intel,High
430,Chhattisgarh,2024-02-25,Riots,4,Patrol Unit A,High
431,West Bengal,2024-04-17,Encounter,5,Patrol Unit A,Medium
432,Maharashtra,2024-05-31,Search Operation,6,Patrol Unit B,Medium
433,Chhattisgarh,2024-11-15,Riots,3,Unit C,High
434,Assam,2024-12-20,Ambush,5,Unit D,Medium
435,Assam,2024-04-16,Riots,0,Local Intel,Medium
436,Bihar,2024-10-20,Encounter,5,Local Intel,Medium
437,Odisha,2024-06-06,IED Blast,2,Patrol Unit B,High
438,Assam,2024-04-26,Riots,6,Patrol Unit A,Low
439,Maharashtra,2024-03-05,Encounter,7,Unit C,Low
440,Odisha,2024-10-13,Ambush,3,Patrol Unit A,High
441,Maharashtra,2024-09-12,Riots,4,Patrol Unit B,High
442,Assam,2024-04-30,IED Blast,3,Patrol Unit B,Medium
443,Kerala,2024-03-21,Riots,6,Unit D,High
444,Assam,2024-07-29,Riots,5,Local Intel,Low
445,West Bengal,2024-02-07,Encounter,3,Unit D,Low
446,Assam,2024-12-07,Patrolling,2,Local Intel,Medium
447,Bihar,2024-07-08,IED Blast,8,Local Intel,Medium
448,Assam,2024-03-19,Search Operation,10,Patrol Unit B,High
449,Chhattisgarh,2024-08-08,Ambush,4,Unit C,Medium
450,Chhattisgarh,2024-07-30,Ambush,3,Unit D,Medium
451,Assam,2024-05-12,Ambush,7,Patrol Unit A,Low
452,Odisha,2024-05-31,Patrolling,9,Patrol Unit A,Medium
453,Maharashtra,2024-02-26,IED Blast,9,Local Intel,High
454,Jammu & Kashmir,2024-09-20,IED Blast,1,Local Intel,Low
455,Bihar,2024-11-19,Search Operation,9,Patrol Unit A,High
456,Jammu & Kashmir,2024-10-01,Riots,0,Unit D,High
457,Assam,2024-12-12,Patrolling,9,Patrol Unit A,High
458,Odisha,2024-03-09,IED Blast,2,Patrol Unit A,Medium
459,Assam,2024-05-20,IED Blast,4,Patrol Unit B,High
460,Maharashtra,2024-08-04,Search Operation,7,Patrol Unit B,Low
461,Kerala,2024-06-07,Patrolling,10,Patrol Unit A,Medium
462,Odisha,2024-09-26,Search Operation,1,Local Intel,Low
463,Kerala,2024-10-07,Patrolling,7,Unit D,High
464,Odisha,2024-05-13,Encounter,5,Patrol Unit B,Medium
465,Chhattisgarh,2024-02-15,Ambush,7,Unit D,High
466,Maharashtra,2024-04-10,Encounter,5,Unit C,Low
467,Jammu & Kashmir,2024-12-27,Riots,1,Unit D,Medium
468,Chhattisgarh,2024-10-02,Patrolling,10,Patrol Unit B,High
469,Bihar,2024-02-16,Encounter,1,Patrol Unit A,Medium
470,Bihar,2024-05-30,Encounter,5,Unit C,Low
471,Jammu & Kashmir,2024-05-23,Riots,1,Local Intel,Low
472,Bihar,2024-06-27,IED Blast,3,Patrol Unit B,Medium
473,Bihar,2024-10-31,IED Blast,7,Unit C,Medium
474,Jammu & Kashmir,2024-06-12,Riots,8,Patrol Unit A,High
475,Jammu & Kashmir,2024-12-07,Ambush,6,Local Intel,Medium
476,Maharashtra,2024-07-06,Encounter,8,Patrol Unit B,High
477,Odisha,2024-07-31,Ambush,4,Unit D,Medium
478,Odisha,2024-12-20,Patrolling,1,Unit C,Low
479,Assam,2024-05-21,Patrolling,6,Unit C,Low
480,Bihar,2024-07-15,IED Blast,6,Unit C,Medium
481,West Bengal,2024-06-13,Riots,3,Patrol Unit A,Low
482,Odisha,2024-09-02,IED Blast,3,Patrol Unit B,High
483,Jammu & Kashmir,2024-12-31,Patrolling,1,Local Intel,Medium
484,Bihar,2024-04-10,IED Blast,9,Local Intel,Low
485,Odisha,2024-07-22,Patrolling,0,Unit C,High
486,Bihar,2024-10-08,Ambush,2,Unit D,High
487,Kerala,2024-12-14,IED Blast,6,Unit C,Low
488,Jammu & Kashmir,2024-11-10,Patrolling,7,Local Intel,Medium
489,Assam,2024-07-17,Patrolling,4,Unit D,Medium
490,Maharashtra,2024-05-18,Patrolling,4,Local Intel,Low
491,Jammu & Kashmir,2024-04-01,Patrolling,1,Unit D,Medium
492,Kerala,2024-06-01,Ambush,5,Unit D,Low
493,West Bengal,2024-11-23,Riots,6,Unit D,Medium
494,Assam,2024-12-30,Encounter,6,Patrol Unit B,High
495,West Bengal,2024-05-16,Patrolling,7,Patrol Unit B,Medium
496,Assam,2024-12-01,IED Blast,9,Patrol Unit A,Medium
497,Odisha,2024-09-16,Riots,0,Patrol Unit B,Medium
498,Kerala,2024-06-15,IED Blast,3,Patrol Unit B,Medium
499,Chhattisgarh,2024-11-23,Search Operation,9,Patrol Unit A,High
500,Jammu & Kashmir,2024-02-29,Encounter,6,Unit C,Medium
501,West Bengal,2024-11-19,Search Operation,3,Patrol Unit B,High
502,Maharashtra,2024-02-21,Encounter,10,Local Intel,Low
503,Kerala,2024-11-24,Ambush,10,Unit D,Medium
504,Odisha,2024-12-31,Patrolling,5,Unit D,Medium
505,Bihar,2024-04-06,Patrolling,7,Patrol Unit B,High
506,Maharashtra,2024-10-28,Search Operation,7,Patrol Unit B,Low
507,Chhattisgarh,2024-09-08,Search Operation,2,Patrol Unit B,Low
508,Jammu & Kashmir,2024-11-13,Encounter,6,Unit C,High
509,Assam,2024-10-26,Search Operation,9,Unit C,Medium
510,Bihar,2024-10-18,Encounter,9,Unit C,Low
511,Assam,2024-02-25,Search Operation,0,Unit D,High
512,Kerala,2024-03-25,Ambush,5,Patrol Unit A,High
513,Assam,2024-08-14,Riots,2,Unit D,Medium
514,Assam,2024-10-29,Patrolling,8,Unit D,Low
515,Odisha,2024-08-17,Patrolling,0,Patrol Unit A,Low
516,Odisha,2024-06-12,Patrolling,4,Patrol Unit A,Medium
517,Kerala,2024-12-26,Encounter,7,Unit C,Low
518,Jammu & Kashmir,2024-04-24,Ambush,6,Unit D,Medium
519,Chhattisgarh,2025-01-17,IED Blast,8,Unit D,Medium
520,Odisha,2024-04-26,Riots,10,Local Intel,High
521,Odisha,2024-07-25,Ambush,1,Patrol Unit A,High
522,Assam,2024-08-13,IED Blast,6,Local Intel,High
523,Chhattisgarh,2024-07-15,Search Operation,1,Local Intel,High
524,Jammu & Kashmir,2024-03-12,Patrolling,6,Unit D,Low
525,Bihar,2024-04-25,Encounter,7,Unit C,High
526,West Bengal,2024-10-18,Encounter,1,Local Intel,Low
527,Assam,2024-06-10,Ambush,6,Unit D,High
528,Jammu & Kashmir,2024-10-03,Search Operation,4,Patrol Unit A,High
529,Assam,2024-04-08,Ambush,2,Unit D,High
530,Jammu & Kashmir,2025-01-15,Riots,1,Local Intel,Low
531,Odisha,2024-05-25,Encounter,8,Unit C,High
532,Maharashtra,2024-09-24,IED Blast,5,Unit C,High
533,Bihar,2025-01-04,Ambush,3,Unit C,High
534,Bihar,2024-11-25,Encounter,10,Unit D,Low
535,Maharashtra,2024-09-29,IED Blast,1,Patrol Unit A,High
536,Assam,2024-05-18,IED Blast,9,Unit C,Low
537,Odisha,2024-03-05,Encounter,5,Local Intel,High
538,Maharashtra,2025-01-03,Encounter,5,Unit C,Medium
539,Assam,2024-08-29,Riots,6,Unit D,High
540,Odisha,2024-09-11,Riots,4,Unit D,Medium
541,Jammu & Kashmir,2024-04-24,Riots,3,Local Intel,Low
542,Chhattisgarh,2024-10-17,Patrolling,2,Unit C,High
543,Chhattisgarh,2024-04-22,Ambush,8,Local Intel,High
544,Odisha,2024-03-13,Ambush,5,Unit C,Low
545,Bihar,2025-01-03,Ambush,2,Patrol Unit B,High
546,Chhattisgarh,2024-12-03,Search Operation,9,Local Intel,Low
547,West Bengal,2024-11-15,Ambush,1,Unit D,High
548,Bihar,2024-12-10,Encounter,3,Local Intel,Medium
549,Chhattisgarh,2024-03-01,Patrolling,3,Local Intel,High
550,Assam,2024-05-16,Encounter,5,Patrol Unit B,Medium
551,Assam,2024-06-02,Encounter,2,Patrol Unit A,Medium
552,Kerala,2024-02-23,Encounter,6,Unit D,Low
553,Assam,2024-06-18,Patrolling,1,Unit C,Medium
554,Bihar,2024-07-20,Encounter,3,Unit D,High
555,West Bengal,2024-07-12,IED Blast,10,Patrol Unit A,Low
556,West Bengal,2024-05-14,IED Blast,2,Unit C,High
557,Maharashtra,2024-04-02,Ambush,0,Patrol Unit A,Medium
558,Maharashtra,2024-06-12,Encounter,2,Unit C,Low
559,Jammu & Kashmir,2024-06-12,Ambush,8,Patrol Unit A,High
560,Kerala,2024-07-24,IED Blast,3,Patrol Unit B,High
561,West Bengal,2024-03-27,Riots,0,Unit D,Medium
562,Odisha,2025-01-01,Ambush,4,Unit C,High
563,Kerala,2024-06-03,Encounter,8,Unit C,High
564,Odisha,2024-02-11,Patrolling,3,Local Intel,High
565,West Bengal,2024-07-21,Riots,10,Unit C,Medium
566,Maharashtra,2024-08-22,Encounter,3,Unit C,Low
567,Chhattisgarh,2024-07-30,Ambush,1,Local Intel,Low
568,West Bengal,2024-08-18,Search Operation,5,Patrol Unit B,Low
569,Assam,2024-09-22,IED Blast,10,Unit C,Low
570,Assam,2024-10-14,IED Blast,1,Local Intel,Low
571,Odisha,2024-09-23,IED Blast,9,Patrol Unit B,Medium
572,Jammu & Kashmir,2024-04-11,Patrolling,7,Unit D,Low
573,Jammu & Kashmir,2024-08-05,Patrolling,0,Local Intel,High
574,Chhattisgarh,2024-10-24,Encounter,8,Unit C,Low
575,Kerala,2024-05-30,Encounter,0,Patrol Unit B,Medium
576,Bihar,2024-09-01,IED Blast,2,Patrol Unit B,Low
577,Odisha,2024-02-10,Riots,2,Local Intel,Low
578,Chhattisgarh,2024-08-02,Search Operation,7,Patrol Unit B,High
579,Jammu & Kashmir,2024-03-24,Search Operation,2,Patrol Unit A,Low
580,Kerala,2024-02-23,Search Operation,0,Local Intel,Medium
581,Maharashtra,2025-01-16,Patrolling,9,Unit D,Medium
582,Odisha,2024-05-08,Search Operation,0,Patrol Unit B,Low
583,Assam,2025-01-21,IED Blast,0,Unit D,Medium
584,Chhattisgarh,2024-07-05,Patrolling,6,Local Intel,Low
585,Kerala,2025-01-08,IED Blast,8,Patrol Unit A,Medium
586,Chhattisgarh,2024-05-20,Search Operation,5,Unit C,High
587,Jammu & Kashmir,2025-01-22,Encounter,8,Unit D,Medium
588,Kerala,2024-11-10,Encounter,0,Patrol Unit B,Low
589,Bihar,2024-07-04,IED Blast,4,Unit D,Low
590,West Bengal,2024-03-07,Ambush,1,Unit D,Low
591,West Bengal,2024-12-18,Encounter,4,Patrol Unit B,High
592,Maharashtra,2024-07-18,Patrolling,6,Unit D,Low
593,West Bengal,2024-10-17,Patrolling,1,Unit C,High
594,Odisha,2024-05-19,Patrolling,5,Unit D,Medium
595,West Bengal,2024-09-21,Ambush,6,Patrol Unit B,Low
596,Chhattisgarh,2024-09-28,Patrolling,9,Patrol Unit B,Medium
597,Odisha,2024-04-04,Patrolling,0,Unit D,Medium
598,Chhattisgarh,2024-12-15,Ambush,4,Patrol Unit A,High
599,Odisha,2024-04-16,Riots,8,Unit C,Low
600,Assam,2024-10-31,Search Operation,6,Local Intel,High
601,Assam,2024-04-25,IED Blast,7,Patrol Unit B,High
602,West Bengal,2024-05-05,Encounter,9,Patrol Unit A,Low
603,Assam,2024-02-09,Patrolling,7,Patrol Unit A,High
604,Assam,2024-06-28,Riots,7,Unit C,High
605,Jammu & Kashmir,2024-05-11,Encounter,0,Unit C,Low
606,Jammu & Kashmir,2024-06-09,Riots,0,Patrol Unit B,Low
607,Maharashtra,2024-10-01,IED Blast,4,Local Intel,High
608,Assam,2025-01-01,Patrolling,6,Local Intel,Low
609,Kerala,2024-09-29,IED Blast,1,Unit D,Medium
610,Odisha,2024-12-14,Riots,3,Local Intel,High
611,Maharashtra,2024-08-06,Patrolling,8,Patrol Unit B,Medium
612,Odisha,2024-06-17,Search Operation,1,Local Intel,Medium
613,Maharashtra,2024-12-23,Encounter,3,Patrol Unit B,Medium
614,Jammu & Kashmir,2024-04-08,Search Operation,2,Patrol Unit A,Low
615,Kerala,2024-08-25,Riots,3,Local Intel,Low
616,Chhattisgarh,2024-12-19,Riots,10,Patrol Unit B,High
617,Maharashtra,2024-07-29,Ambush,0,Local Intel,High
618,West Bengal,2024-12-30,Patrolling,1,Patrol Unit B,High
619,Bihar,2024-06-18,Encounter,0,Unit D,Medium
620,Assam,2024-03-18,Ambush,5,Unit D,Low
621,Odisha,2024-08-19,Patrolling,9,Local Intel,Low
622,Odisha,2024-12-20,Search Operation,6,Unit C,Low
623,Assam,2024-04-17,Encounter,7,Local Intel,High
624,Maharashtra,2024-06-09,Search Operation,1,Patrol Unit A,Medium
625,Chhattisgarh,2025-01-06,Patrolling,9,Local Intel,High
626,West Bengal,2024-08-13,Encounter,0,Unit C,Medium
627,Maharashtra,2025-01-05,Patrolling,9,Patrol Unit A,High
628,Kerala,2024-10-23,Patrolling,10,Local Intel,High
629,Chhattisgarh,2025-01-25,IED Blast,10,Local Intel,Medium
630,Odisha,2024-04-02,IED Blast,10,Local Intel,Medium
631,Assam,2024-05-29,Encounter,0,Unit C,Medium
632,Jammu & Kashmir,2024-02-02,Riots,6,Unit C,Medium
633,Jammu & Kashmir,2024-11-19,Encounter,10,Unit D,High
634,Maharashtra,2025-01-21,Riots,7,Local Intel,Medium
635,Jammu & Kashmir,2024-03-09,Ambush,7,Unit D,Low
636,Jammu & Kashmir,2024-12-15,Search Operation,6,Patrol Unit B,Medium
637,Maharashtra,2024-12-11,Patrolling,3,Patrol Unit A,High
638,Bihar,2024-04-03,Ambush,3,Patrol Unit B,High
639,Odisha,2024-10-25,Search Operation,7,Unit D,Medium
640,Assam,2024-10-10,IED Blast,7,Patrol Unit A,High
641,Bihar,2024-06-16,Encounter,6,Unit C,Low
642,Assam,2024-11-16,Encounter,7,Patrol Unit A,High
643,Odisha,2024-08-27,Riots,10,Patrol Unit A,Low
644,Bihar,2024-03-26,IED Blast,8,Local Intel,High
645,Assam,2024-10-01,Riots,0,Unit C,High
646,Odisha,2024-03-24,Patrolling,9,Unit C,Medium
647,Kerala,2024-05-31,Riots,10,Unit C,Low
648,Assam,2024-03-06,Search Operation,6,Patrol Unit A,Medium
649,Jammu & Kashmir,2024-08-17,Riots,5,Patrol Unit B,High
650,Odisha,2024-12-06,Ambush,5,Patrol Unit B,Low
651,Maharashtra,2024-09-22,IED Blast,1,Unit D,High
652,Maharashtra,2024-12-31,Encounter,9,Local Intel,Medium
653,Kerala,2024-06-19,Patrolling,6,Local Intel,High
654,Odisha,2025-01-17,IED Blast,5,Unit D,High
655,Kerala,2024-04-02,Search Operation,10,Local Intel,Medium
656,West Bengal,2024-03-22,Encounter,10,Patrol Unit B,Low
657,Maharashtra,2025-01-14,Encounter,0,Local Intel,High
658,Odisha,2024-04-24,Riots,1,Unit D,High
659,Maharashtra,2024-11-05,IED Blast,8,Unit D,Low
660,West Bengal,2024-07-11,Ambush,5,Patrol Unit B,Medium
661,West Bengal,2024-03-04,Encounter,2,Unit D,Medium
662,Assam,2024-10-21,Search Operation,2,Patrol Unit A,Low
663,Kerala,2024-06-15,Search Operation,8,Unit D,Medium
664,West Bengal,2024-09-29,Patrolling,1,Patrol Unit A,High
665,Bihar,2024-04-09,Riots,3,Patrol Unit A,Medium
666,Maharashtra,2024-04-17,Ambush,3,Patrol Unit A,High
667,Maharashtra,2024-07-18,Riots,5,Local Intel,Medium
668,Assam,2024-02-21,Patrolling,3,Patrol Unit B,High
669,Jammu & Kashmir,2024-08-05,IED Blast,1,Unit C,High
670,West Bengal,2024-09-24,Patrolling,6,Patrol Unit A,Medium
671,Jammu & Kashmir,2024-09-12,Search Operation,6,Local Intel,Medium
672,West Bengal,2024-09-08,IED Blast,1,Unit D,Medium
673,Kerala,2024-11-24,Ambush,3,Unit C,Medium
674,Chhattisgarh,2024-02-05,Ambush,4,Local Intel,Medium
675,Odisha,2024-05-12,Ambush,7,Unit C,Low
676,Odisha,2024-08-24,Encounter,0,Unit D,Low
677,Bihar,2024-03-28,IED Blast,5,Patrol Unit A,Low
678,Assam,2024-11-02,IED Blast,4,Patrol Unit B,High
679,Kerala,2024-02-06,Ambush,0,Unit C,Low
680,Chhattisgarh,2024-09-01,Search Operation,6,Local Intel,High
681,Jammu & Kashmir,2024-10-26,Riots,1,Unit C,Low
682,Jammu & Kashmir,2024-09-14,Search Operation,9,Local Intel,Low
683,Jammu & Kashmir,2024-02-14,Search Operation,10,Patrol Unit A,Low
684,West Bengal,2024-12-27,Ambush,9,Unit D,Medium
685,Assam,2024-02-29,Encounter,9,Unit D,Medium
686,Bihar,2024-03-10,Patrolling,0,Unit C,Low
687,Jammu & Kashmir,2024-03-04,Search Operation,9,Patrol Unit B,High
688,West Bengal,2024-03-29,Riots,8,Local Intel,Low
689,Maharashtra,2024-05-05,Encounter,7,Unit C,Medium
690,Jammu & Kashmir,2024-03-21,Encounter,4,Patrol Unit A,Medium
691,Odisha,2025-01-01,Patrolling,0,Patrol Unit B,Low
692,Maharashtra,2024-04-15,Encounter,5,Unit C,High
693,Kerala,2024-08-24,Riots,8,Unit C,High
694,Assam,2025-01-10,Patrolling,3,Patrol Unit A,Medium
695,Bihar,2024-10-11,Encounter,5,Unit C,Medium
696,Kerala,2024-06-19,Patrolling,6,Patrol Unit B,Medium
697,Bihar,2024-11-29,Riots,3,Patrol Unit A,High
698,Chhattisgarh,2024-09-02,Riots,1,Unit D,Medium
699,Assam,2024-03-01,Ambush,2,Patrol Unit B,High
700,Chhattisgarh,2024-10-04,Ambush,7,Unit C,Low
701,West Bengal,2024-11-27,Search Operation,7,Patrol Unit A,High
702,Assam,2024-07-05,Riots,4,Unit D,High
703,West Bengal,2024-11-22,Search Operation,6,Patrol Unit B,Low
704,Kerala,2024-03-31,Search Operation,0,Patrol Unit B,Medium
705,Jammu & Kashmir,2024-04-03,Patrolling,7,Patrol Unit A,Low
706,Kerala,2024-12-17,IED Blast,7,Patrol Unit B,High
707,Assam,2024-03-19,Patrolling,4,Patrol Unit A,Medium
708,Chhattisgarh,2024-05-30,Patrolling,4,Patrol Unit A,Low
709,Jammu & Kashmir,2024-02-29,Riots,4,Unit D,High
710,Assam,2024-11-27,Encounter,3,Patrol Unit A,Low
711,Chhattisgarh,2024-09-05,Search Operation,4,Unit C,Medium
712,Jammu & Kashmir,2024-08-10,Encounter,9,Patrol Unit B,Low
713,West Bengal,2024-10-17,Patrolling,9,Unit D,Medium
714,Chhattisgarh,2024-07-07,Riots,8,Local Intel,Medium
715,Kerala,2024-10-24,Ambush,4,Patrol Unit B,Low
716,Odisha,2024-02-29,IED Blast,10,Patrol Unit A,Low
717,Jammu & Kashmir,2024-12-01,IED Blast,0,Local Intel,Medium
718,Odisha,2024-03-23,Ambush,6,Patrol Unit B,High
719,Assam,2024-12-28,Encounter,10,Unit D,High
720,Chhattisgarh,2024-10-20,Encounter,10,Unit D,Medium
721,Jammu & Kashmir,2024-06-24,Ambush,8,Unit D,High
722,West Bengal,2024-09-22,Patrolling,4,Local Intel,Low
723,Bihar,2024-04-18,Search Operation,2,Unit C,Medium
724,Jammu & Kashmir,2024-08-24,Ambush,8,Patrol Unit A,Medium
725,Odisha,2024-07-11,Search Operation,4,Unit C,Low
726,Jammu & Kashmir,2024-10-13,IED Blast,9,Patrol Unit B,Medium
727,Bihar,2024-08-06,Encounter,3,Local Intel,Low
728,Odisha,2024-11-30,Riots,9,Local Intel,Medium
729,Kerala,2024-04-29,Ambush,0,Unit C,Medium
730,Maharashtra,2024-12-15,IED Blast,9,Unit D,Low
731,Bihar,2024-02-26,Riots,8,Unit D,Medium
732,Odisha,2024-08-01,Riots,2,Patrol Unit A,High
733,Kerala,2024-03-19,Riots,5,Unit D,Low
734,Maharashtra,2024-06-06,Ambush,6,Patrol Unit A,Low
735,Jammu & Kashmir,2024-04-29,Search Operation,9,Patrol Unit B,Low
736,Jammu & Kashmir,2024-06-30,Riots,0,Unit C,Low
737,West Bengal,2024-11-04,Ambush,4,Unit D,High
738,Odisha,2024-07-27,Search Operation,8,Unit D,High
739,Odisha,2024-04-03,Search Operation,10,Unit C,Medium
740,Jammu & Kashmir,2024-06-09,Search Operation,4,Unit D,High
741,Jammu & Kashmir,2024-02-22,Search Operation,8,Unit C,Medium
742,West Bengal,2024-10-29,IED Blast,10,Patrol Unit B,Medium
743,Maharashtra,2024-02-01,Ambush,10,Local Intel,Low
744,Jammu & Kashmir,2024-05-20,IED Blast,9,Unit C,High
745,Jammu & Kashmir,2024-08-12,Search Operation,10,Unit C,High
746,Odisha,2024-10-16,IED Blast,9,Unit D,High
747,Kerala,2025-01-20,IED Blast,9,Patrol Unit A,Medium
748,Jammu & Kashmir,2024-09-25,Encounter,8,Local Intel,High
749,Odisha,2024-08-13,Patrolling,0,Patrol Unit A,Medium
750,Maharashtra,2024-03-19,IED Blast,9,Unit C,Medium
751,Jammu & Kashmir,2024-09-14,Encounter,2,Local Intel,Medium
752,Jammu & Kashmir,2024-08-18,Encounter,2,Local Intel,High
753,Bihar,2024-02-29,Encounter,7,Unit D,Medium
754,Jammu & Kashmir,2024-06-10,Patrolling,0,Local Intel,High
755,Odisha,2024-11-30,IED Blast,9,Unit C,High
756,Assam,2024-02-24,IED Blast,8,Patrol Unit A,Low
757,Kerala,2024-11-23,Patrolling,10,Unit C,Low
758,West Bengal,2024-05-10,Ambush,5,Local Intel,Low
759,Jammu & Kashmir,2024-08-27,Riots,4,Patrol Unit B,High
760,Maharashtra,2024-10-22,Riots,1,Local Intel,High
761,West Bengal,2024-12-12,Patrolling,1,Unit D,High
762,Bihar,2024-05-17,Riots,3,Local Intel,Medium
763,Assam,2024-10-28,Patrolling,10,Patrol Unit A,Medium
764,West Bengal,2024-08-01,IED Blast,5,Unit D,High
765,West Bengal,2024-03-08,Patrolling,1,Unit D,Medium
766,Assam,2024-02-24,Search Operation,6,Local Intel,Low
767,Jammu & Kashmir,2024-07-02,Encounter,2,Patrol Unit B,Medium
768,Assam,2024-08-14,Search Operation,8,Patrol Unit A,Medium
769,Kerala,2024-02-23,IED Blast,2,Unit D,Low
770,West Bengal,2024-06-27,Patrolling,2,Local Intel,Low
771,Maharashtra,2024-04-04,Patrolling,0,Unit D,Low
772,Bihar,2025-01-07,Patrolling,3,Unit D,Medium
773,Maharashtra,2024-01-30,Search Operation,3,Unit C,Medium
774,Kerala,2024-06-05,Ambush,5,Patrol Unit A,Medium
775,Chhattisgarh,2024-05-20,Encounter,5,Unit D,High
776,Chhattisgarh,2024-06-27,Riots,5,Patrol Unit B,Medium
777,Chhattisgarh,2024-12-19,Patrolling,5,Unit C,High
778,West Bengal,2025-01-01,Search Operation,5,Patrol Unit B,High
779,Assam,2024-03-22,Search Operation,4,Patrol Unit A,High
780,Odisha,2024-12-08,Riots,9,Unit D,Low
781,Jammu & Kashmir,2024-10-10,IED Blast,5,Local Intel,Low
782,Odisha,2024-05-12,Riots,9,Local Intel,High
783,Kerala,2024-05-03,Riots,4,Unit D,Medium
784,Assam,2024-02-08,Ambush,3,Unit C,Medium
785,Maharashtra,2025-01-25,Ambush,3,Patrol Unit B,Low
786,Maharashtra,2024-06-30,Ambush,10,Patrol Unit A,Low
787,Bihar,2024-12-26,Patrolling,0,Patrol Unit B,High
788,Chhattisgarh,2024-07-08,Ambush,2,Patrol Unit A,High
789,West Bengal,2024-12-04,Ambush,3,Unit D,Low
790,Jammu & Kashmir,2024-05-12,Patrolling,0,Patrol Unit B,Medium
791,Kerala,2025-01-25,Riots,0,Local Intel,Low
792,West Bengal,2024-12-09,Patrolling,9,Local Intel,High
793,West Bengal,2024-05-15,Search Operation,6,Local Intel,High
794,Kerala,2024-05-17,Encounter,7,Local Intel,High
795,Kerala,2024-10-05,Search Operation,3,Patrol Unit A,Medium
796,West Bengal,2024-08-19,Ambush,8,Local Intel,Low
797,Maharashtra,2024-09-05,Patrolling,9,Patrol Unit A,Medium
798,Assam,2024-03-09,Ambush,6,Unit D,Low
799,Maharashtra,2024-05-10,Patrolling,5,Patrol Unit A,High
800,Maharashtra,2024-09-15,Encounter,4,Local Intel,Low
801,Chhattisgarh,2024-06-01,Riots,8,Patrol Unit B,Medium
802,Chhattisgarh,2024-05-06,Encounter,7,Patrol Unit B,High
803,Kerala,2024-02-13,Encounter,7,Patrol Unit A,Low
804,Bihar,2024-09-18,Search Operation,10,Unit C,Low
805,Maharashtra,2025-01-19,Patrolling,2,Unit D,Medium
806,Bihar,2024-07-24,Riots,4,Unit C,Medium
807,Jammu & Kashmir,2024-05-11,Ambush,10,Unit D,Low
808,Maharashtra,2024-11-22,Riots,8,Patrol Unit B,Low
809,Kerala,2024-01-27,Riots,7,Unit C,Low
810,West Bengal,2025-01-25,Search Operation,5,Unit D,Medium
811,Odisha,2024-11-12,Encounter,4,Unit C,Low
812,Chhattisgarh,2024-09-09,Ambush,9,Unit C,Low
813,Jammu & Kashmir,2024-04-09,Patrolling,1,Patrol Unit B,Medium
814,West Bengal,2024-06-22,Riots,8,Local Intel,High
815,Jammu & Kashmir,2024-10-22,Riots,6,Unit D,High
816,Kerala,2024-09-19,Search Operation,2,Unit C,High
817,Odisha,2024-07-16,Encounter,0,Unit D,Medium
818,Kerala,2024-04-12,IED Blast,1,Unit C,High
819,Assam,2024-03-05,Ambush,7,Local Intel,High
820,Chhattisgarh,2024-08-15,Ambush,7,Patrol Unit A,High
821,West Bengal,2024-02-01,Encounter,1,Patrol Unit B,Medium
822,Assam,2024-07-30,Search Operation,6,Local Intel,Medium
823,Odisha,2024-04-30,Ambush,5,Unit C,Medium
824,Bihar,2024-04-15,Ambush,10,Patrol Unit B,Low
825,Chhattisgarh,2024-01-30,IED Blast,10,Local Intel,Medium
826,Kerala,2024-03-01,Riots,2,Unit C,High
827,Odisha,2024-02-15,IED Blast,3,Local Intel,Medium
828,Maharashtra,2024-04-29,Encounter,0,Unit D,Low
829,Maharashtra,2024-06-22,Search Operation,5,Patrol Unit A,High
830,Assam,2024-03-15,Search Operation,0,Local Intel,Low
831,Jammu & Kashmir,2024-10-18,Search Operation,8,Unit D,Medium
832,West Bengal,2024-05-13,Riots,10,Unit D,High
833,Odisha,2024-06-28,Ambush,9,Unit D,Low
834,Kerala,2024-05-28,Riots,3,Unit D,Medium
835,Maharashtra,2024-05-09,Patrolling,7,Patrol Unit A,High
836,Assam,2024-03-21,Riots,4,Unit C,Medium
837,West Bengal,2025-01-23,Encounter,1,Patrol Unit B,Medium
838,Kerala,2024-06-14,Search Operation,3,Patrol Unit A,Medium
839,Chhattisgarh,2024-04-23,IED Blast,1,Local Intel,High
840,Chhattisgarh,2024-07-12,IED Blast,1,Unit C,Medium
841,Maharashtra,2024-05-13,Riots,1,Unit D,Medium
842,Jammu & Kashmir,2024-08-28,Ambush,4,Unit C,Medium
843,Odisha,2024-09-22,Ambush,9,Patrol Unit B,Low
844,Bihar,2024-12-17,Encounter,6,Patrol Unit A,Medium
845,Assam,2024-03-19,Patrolling,9,Local Intel,Low
846,Bihar,2024-08-01,Encounter,4,Unit C,Medium
847,Assam,2024-04-16,Riots,8,Patrol Unit B,High
848,Jammu & Kashmir,2024-11-26,Ambush,9,Local Intel,High
849,Chhattisgarh,2024-06-22,Encounter,0,Unit D,High
850,West Bengal,2024-09-04,Patrolling,5,Patrol Unit A,High
851,Jammu & Kashmir,2024-10-22,Search Operation,2,Patrol Unit B,High
852,Kerala,2024-02-16,Ambush,10,Unit C,Low
853,West Bengal,2024-03-01,Search Operation,4,Unit D,Low
854,West Bengal,2024-09-07,Encounter,6,Patrol Unit B,Medium
855,Chhattisgarh,2024-02-06,Encounter,10,Local Intel,Low
856,Jammu & Kashmir,2024-01-31,Search Operation,9,Unit D,Low
857,Kerala,2024-10-26,Ambush,6,Patrol Unit A,High
858,Bihar,2024-12-12,Riots,0,Patrol Unit A,High
859,Maharashtra,2024-03-21,IED Blast,1,Patrol Unit B,Low
860,Chhattisgarh,2024-12-27,Encounter,7,Unit D,High
861,Maharashtra,2024-12-17,Ambush,8,Unit C,Low
862,Odisha,2024-03-13,Patrolling,5,Patrol Unit A,High
863,Maharashtra,2024-06-08,IED Blast,9,Unit D,Low
864,West Bengal,2024-08-05,IED Blast,4,Local Intel,Medium
865,Maharashtra,2025-01-01,Encounter,9,Unit D,Medium
866,Kerala,2024-03-31,Ambush,5,Local Intel,High
867,Assam,2024-11-19,IED Blast,4,Unit D,Medium
868,West Bengal,2024-03-16,IED Blast,5,Unit D,Medium
869,Jammu & Kashmir,2024-12-30,Search Operation,7,Local Intel,Medium
870,Chhattisgarh,2024-11-23,IED Blast,9,Local Intel,Low
871,Maharashtra,2024-05-18,Patrolling,1,Unit C,Medium
872,Bihar,2024-08-15,Riots,0,Patrol Unit B,Low
873,West Bengal,2024-05-05,Riots,6,Patrol Unit A,High
874,Bihar,2024-05-07,Patrolling,5,Local Intel,Low
875,Bihar,2024-07-22,IED Blast,9,Patrol Unit A,High
876,Jammu & Kashmir,2024-08-28,Search Operation,7,Unit C,High
877,Kerala,2024-07-05,Patrolling,0,Patrol Unit B,Low
878,Assam,2024-05-12,Patrolling,7,Patrol Unit A,Medium
879,Kerala,2024-11-07,Patrolling,7,Patrol Unit A,Low
880,Jammu & Kashmir,2024-04-12,Search Operation,6,Unit D,Low
881,Chhattisgarh,2024-04-09,Ambush,3,Patrol Unit B,Low
882,West Bengal,2025-01-16,Patrolling,7,Patrol Unit A,Medium
883,Chhattisgarh,2024-07-04,Patrolling,6,Unit D,Medium
884,West Bengal,2024-10-07,Search Operation,6,Local Intel,Low
885,Maharashtra,2024-07-07,Encounter,5,Patrol Unit A,High
886,Maharashtra,2025-01-22,Encounter,0,Patrol Unit A,Low
887,Assam,2024-07-01,Patrolling,3,Unit C,Low
888,Odisha,2024-09-07,Search Operation,3,Local Intel,High
889,Assam,2024-04-16,Patrolling,2,Patrol Unit A,High
890,Bihar,2024-12-02,IED Blast,3,Local Intel,High
891,Chhattisgarh,2024-12-01,Riots,0,Unit C,Low
892,Assam,2024-06-10,Riots,6,Patrol Unit B,High
893,Assam,2024-04-28,Ambush,6,Unit C,High
894,Assam,2024-06-07,Search Operation,5,Unit C,High
895,Jammu & Kashmir,2024-12-16,Ambush,9,Unit D,Low
896,Odisha,2024-08-23,Ambush,0,Patrol Unit B,High
897,Odisha,2024-03-24,Patrolling,3,Local Intel,Low
898,Assam,2024-02-15,IED Blast,4,Unit C,Medium
899,Odisha,2024-02-21,Encounter,3,Patrol Unit A,High
900,Jammu & Kashmir,2024-04-15,Ambush,0,Patrol Unit A,High
901,Odisha,2024-05-21,Search Operation,7,Unit D,High
902,Assam,2024-09-18,Ambush,6,Unit C,High
903,West Bengal,2024-12-14,Encounter,10,Unit D,Medium
904,Kerala,2025-01-12,Riots,10,Unit C,Medium
905,Jammu & Kashmir,2024-07-08,IED Blast,2,Local Intel,Low
906,Odisha,2024-03-27,Encounter,1,Unit D,Low
907,Maharashtra,2024-12-30,IED Blast,8,Unit C,High
908,Bihar,2024-11-24,Ambush,1,Unit C,Medium
909,West Bengal,2024-06-05,Ambush,0,Patrol Unit B,Low
910,Bihar,2024-10-19,Patrolling,1,Patrol Unit A,High
911,Odisha,2024-12-18,Search Operation,3,Local Intel,Medium
912,Jammu & Kashmir,2024-05-27,Riots,2,Unit D,Medium
913,Kerala,2024-06-24,Ambush,6,Patrol Unit A,Low
914,Odisha,2024-11-03,Riots,6,Unit D,Low
915,West Bengal,2024-03-10,IED Blast,2,Patrol Unit B,High
916,Odisha,2024-02-08,Patrolling,1,Local Intel,High
917,Maharashtra,2024-04-11,Patrolling,7,Unit D,Low
918,Assam,2024-09-27,Search Operation,4,Unit D,Low
919,Chhattisgarh,2024-10-21,Ambush,3,Patrol Unit B,Medium
920,Kerala,2024-11-08,Ambush,8,Local Intel,High
921,Chhattisgarh,2024-10-24,Search Operation,10,Patrol Unit B,High
922,Jammu & Kashmir,2024-07-02,IED Blast,8,Patrol Unit A,Low
923,Bihar,2024-05-16,IED Blast,0,Unit D,Low
924,Bihar,2024-08-25,Encounter,2,Unit D,High
925,West Bengal,2024-03-20,Ambush,7,Local Intel,High
926,Bihar,2024-04-18,Ambush,7,Patrol Unit B,Low
927,Kerala,2024-11-08,Ambush,9,Unit C,High
928,Assam,2024-02-24,IED Blast,8,Unit C,Medium
929,Bihar,2025-01-05,Riots,8,Patrol Unit B,Low
930,Chhattisgarh,2024-09-06,Encounter,0,Local Intel,High
931,Maharashtra,2024-12-17,Encounter,1,Patrol Unit B,High
932,Bihar,2024-11-19,Search Operation,5,Unit D,High
933,Maharashtra,2024-09-21,Encounter,6,Patrol Unit A,Low
934,Jammu & Kashmir,2024-03-11,Encounter,1,Patrol Unit A,High
935,Maharashtra,2024-03-19,Search Operation,4,Patrol Unit B,Medium
936,Odisha,2024-06-26,Riots,4,Unit C,Medium
937,Kerala,2024-12-31,Ambush,7,Patrol Unit B,Low
938,Jammu & Kashmir,2024-04-06,Search Operation,6,Unit C,High
939,West Bengal,2024-03-19,Riots,5,Patrol Unit A,Low
940,Chhattisgarh,2024-12-28,Riots,3,Patrol Unit B,Medium
941,Kerala,2024-05-02,IED Blast,3,Patrol Unit A,High
942,Kerala,2024-04-19,Search Operation,4,Patrol Unit A,Medium
943,Assam,2024-02-28,Patrolling,4,Local Intel,Low
944,Assam,2024-07-04,IED Blast,8,Patrol Unit B,High
945,Odisha,2024-11-19,Patrolling,4,Patrol Unit A,High
946,Jammu & Kashmir,2024-08-11,Search Operation,0,Unit D,Low
947,Bihar,2024-11-28,Riots,3,Local Intel,Low
948,Bihar,2024-02-21,Encounter,5,Unit C,Medium
949,Kerala,2024-03-16,Search Operation,7,Unit D,Medium
950,Bihar,2024-10-30,Encounter,5,Patrol Unit A,Medium
951,Bihar,2024-06-11,IED Blast,4,Local Intel,Medium
952,West Bengal,2024-03-13,Search Operation,9,Patrol Unit A,High
953,Odisha,2024-06-12,Search Operation,4,Unit C,Medium
954,Kerala,2024-06-14,Encounter,3,Patrol Unit A,High
955,Bihar,2024-05-09,Encounter,2,Patrol Unit A,Medium
956,Jammu & Kashmir,2024-02-02,IED Blast,3,Patrol Unit B,Medium
957,Chhattisgarh,2024-07-30,Ambush,5,Unit C,Medium
958,Bihar,2024-12-08,IED Blast,9,Unit D,Low
959,Odisha,2024-02-04,IED Blast,6,Local Intel,High
960,Bihar,2024-07-05,Patrolling,5,Patrol Unit A,Medium
961,West Bengal,2024-11-09,Ambush,3,Patrol Unit A,High
962,Chhattisgarh,2025-01-18,Ambush,7,Unit C,Low
963,Maharashtra,2024-02-23,IED Blast,6,Unit C,Medium
964,Maharashtra,2024-07-10,Encounter,2,Patrol Unit A,Medium
965,Assam,2024-09-23,Riots,6,Patrol Unit B,High
966,Assam,2024-05-12,Encounter,4,Patrol Unit B,High
967,Odisha,2024-02-08,Encounter,4,Local Intel,Low
968,West Bengal,2024-10-01,IED Blast,6,Unit C,Low
969,Assam,2024-03-19,Riots,9,Local Intel,Low
970,Assam,2025-01-19,Riots,5,Unit D,Low
971,Kerala,2025-01-22,Encounter,7,Local Intel,High
972,Bihar,2024-05-30,Encounter,5,Unit D,Medium
973,Maharashtra,2024-01-28,Ambush,2,Patrol Unit A,Medium
974,Jammu & Kashmir,2024-09-22,Patrolling,10,Patrol Unit A,Low
975,Kerala,2024-04-11,Ambush,1,Unit D,Low
976,Chhattisgarh,2024-09-25,Search Operation,4,Patrol Unit B,Medium
977,Odisha,2024-11-16,Search Operation,0,Unit C,Medium
978,Bihar,2024-07-31,IED Blast,10,Unit C,Medium
979,Odisha,2024-05-02,Search Operation,3,Unit C,Medium
980,Bihar,2024-07-06,Riots,10,Patrol Unit B,Low
981,West Bengal,2024-07-22,Search Operation,1,Unit C,Medium
982,Bihar,2024-09-09,Patrolling,10,Patrol Unit B,Low
983,Assam,2024-05-26,Riots,0,Patrol Unit A,Medium
984,Jammu & Kashmir,2024-11-11,Encounter,1,Unit D,High
985,Bihar,2024-08-20,IED Blast,3,Patrol Unit B,Medium
986,West Bengal,2024-11-17,Ambush,3,Patrol Unit A,High
987,West Bengal,2024-10-29,Patrolling,6,Unit D,Low
988,Kerala,2024-07-24,Riots,2,Local Intel,Low
989,Maharashtra,2024-02-21,Encounter,4,Patrol Unit A,Medium
990,Maharashtra,2024-09-05,Search Operation,7,Unit C,Medium
991,Jammu & Kashmir,2024-04-05,Encounter,7,Local Intel,Low
992,West Bengal,2024-11-26,Encounter,7,Local Intel,High
993,West Bengal,2024-05-25,Search Operation,5,Patrol Unit B,High
994,Bihar,2025-01-02,Encounter,9,Unit D,High
995,Maharashtra,2024-03-07,Search Operation,8,Unit C,High
996,Maharashtra,2024-10-18,Encounter,2,Patrol Unit A,Medium
997,Maharashtra,2024-07-20,Riots,10,Unit D,Medium
998,Maharashtra,2024-05-28,Patrolling,8,Unit D,Low
999,Assam,2024-04-03,Ambush,10,Patrol Unit A,High
1000,Chhattisgarh,2024-05-28,Ambush,3,Unit C,High
Step 1: Upload the CSV File to Google Colab
In Google Colab, click on the folder icon on the left sidebar to open the file explorer.
Click the upload icon (a file with an upward arrow) in the file explorer.
Select the CRPF_Incidents_Large.csv file you downloaded.
Step 2: Import Required Libraries and Read the CSV File
In a new code cell in Colab, copy and paste the following code to import the required libraries and load the CSV file using pandas:
import pandas as pd
# Load the uploaded CSV file
file_path = "/content/CRPF_Incidents_Large.csv"
df = pd.read_csv(file_path)
# Display the first few rows
print("Data Preview:")
print(df.head())
Run the code cell.
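Before moving on, it is worth a quick sanity check of what was loaded: confirm the row/column count and look for missing values. A minimal sketch (the small inline DataFrame below is a made-up stand-in for the df loaded from the CSV; in Colab, run these checks on your actual df):

```python
import pandas as pd

# Hypothetical mini-sample mirroring the CSV's columns, so this runs standalone.
df = pd.DataFrame({
    "Incident_ID": [1, 2, 3],
    "Location": ["Kerala", "Bihar", "Assam"],
    "Date": ["2024-02-23", "2025-01-07", "2024-03-22"],
    "Incident_Type": ["IED Blast", "Patrolling", "Search Operation"],
    "Casualties": [2, 3, 4],
    "Reported_By": ["Unit D", "Unit D", "Patrol Unit A"],
    "Severity": ["Low", "Medium", "High"],
})

# Shape tells you how many rows survived the load; isnull().sum() flags gaps.
print("Shape (rows, columns):", df.shape)
print("Missing values per column:")
print(df.isnull().sum())
```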
Step 3: Analyze and Visualize the Data
In a new code cell, paste the following code to generate some basic insights and visualize the data using matplotlib:
import matplotlib.pyplot as plt
# Basic statistics
print("Dataset Overview:")
print(df.describe())
# Incident Type Distribution
incident_counts = df['Incident_Type'].value_counts()
plt.figure(figsize=(8, 6))
incident_counts.plot(kind='bar', title="Incident Type Distribution")
plt.xlabel("Incident Type")
plt.ylabel("Number of Incidents")
plt.show()
# Severity Distribution
severity_counts = df['Severity'].value_counts()
plt.figure(figsize=(8, 6))
severity_counts.plot(kind='pie', autopct='%1.1f%%', title="Severity Distribution")
plt.ylabel("") # To remove y-label for pie chart
plt.show()
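Beyond the two plots above, cross-tabulating incident type against severity shows how the two categorical columns relate. A hedged sketch using a few made-up sample rows in place of the full df:

```python
import pandas as pd

# Hypothetical sample rows standing in for the loaded df.
df = pd.DataFrame({
    "Incident_Type": ["Ambush", "Ambush", "Riots", "IED Blast", "Riots"],
    "Severity": ["High", "Low", "Medium", "High", "Low"],
})

# Each cell counts how often a severity level occurs for an incident type.
ct = pd.crosstab(df["Incident_Type"], df["Severity"])
print(ct)
```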
Step 4: Preparing Data for Prediction
To predict future incidents or identify patterns, we can start by encoding categorical data and applying a basic machine learning model.
Here’s the code to prepare the data:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, accuracy_score
# Encode categorical data
label_encoders = {}
for column in ['Location', 'Incident_Type', 'Reported_By', 'Severity']:
    le = LabelEncoder()
    df[column] = le.fit_transform(df[column])
    label_encoders[column] = le
# Features and target variable
X = df[['Location', 'Incident_Type', 'Casualties', 'Reported_By']]
y = df['Severity'] # Predicting severity
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Initialize and train model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
# Predictions
y_pred = model.predict(X_test)
# Evaluation
print("Classification Report:")
print(classification_report(y_test, y_pred))
print("Accuracy Score:")
print(accuracy_score(y_test, y_pred))
What This Code Does:
Encodes Categorical Data: Converts text columns (e.g., Location, Incident_Type, etc.) into numeric format using LabelEncoder.
Defines Features and Target:
Features (X): Includes Location, Incident_Type, Casualties, and Reported_By.
Target (y): Predicts Severity.
Splits the Dataset: Divides the data into training (70%) and testing (30%) sets.
Trains a Random Forest Classifier: A robust algorithm for classification tasks.
Evaluates the Model: Prints the classification report and accuracy score.
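If you want to see exactly how LabelEncoder turned text into numbers, inspect its classes_ attribute. A small standalone sketch with sample severity labels:

```python
from sklearn.preprocessing import LabelEncoder

# Fit an encoder on sample severity labels to reveal the text -> number mapping.
le = LabelEncoder()
codes = le.fit_transform(["Low", "Medium", "High", "Medium"])

# classes_ holds the original labels in sorted order; the index is the code.
mapping = {label: code for code, label in enumerate(le.classes_)}
print(mapping)  # {'High': 0, 'Low': 1, 'Medium': 2}
```

Note that the mapping is alphabetical, so "High" becomes 0: the encoded Severity column is not ordered Low < Medium < High.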
Step 5: Predicting New Data
Here’s the code to test the model with new data:
# Example new data for prediction
new_data = pd.DataFrame({
    "Location": ["Chhattisgarh", "Jammu & Kashmir", "Assam"],
    "Incident_Type": ["Encounter", "IED Blast", "Riots"],
    "Casualties": [3, 5, 0],
    "Reported_By": ["Patrol Unit A", "Patrol Unit B", "Unit C"]
})
# Encode a copy with the same encoders, so the readable text values are kept
encoded = new_data.copy()
for column in ['Location', 'Incident_Type', 'Reported_By']:
    encoded[column] = label_encoders[column].transform(encoded[column])
# Make predictions on the encoded copy
predictions = model.predict(encoded)
predicted_severity = label_encoders['Severity'].inverse_transform(predictions)
# Display predictions alongside the original, readable inputs
new_data['Predicted_Severity'] = predicted_severity
print("Predictions for New Data:")
print(new_data)
Step 6: Visualizing Feature Importance
This step helps explain which features contribute the most to the model's predictions:
# Feature Importance Visualization
importances = model.feature_importances_
features = X.columns
plt.figure(figsize=(8, 6))
plt.bar(features, importances)
plt.title("Feature Importance")
plt.xlabel("Features")
plt.ylabel("Importance Score")
plt.show()
What These Code Blocks Do:
New Data Prediction:
You provide new incident details (like Location, Incident_Type, etc.).
The model predicts the severity for these incidents.
Feature Importance:
Visualizes which features (e.g., Location, Casualties) have the most impact on predictions.
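A sorted listing of the importances can complement the bar chart in Step 6. The sketch below trains a throwaway RandomForest on synthetic data with the same column names, since the real X and model live in your Colab session:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Tiny synthetic dataset standing in for the real X and y.
rng = np.random.RandomState(42)
X = pd.DataFrame({
    "Location": rng.randint(0, 8, 100),
    "Incident_Type": rng.randint(0, 6, 100),
    "Casualties": rng.randint(0, 11, 100),
    "Reported_By": rng.randint(0, 5, 100),
})
y = rng.randint(0, 3, 100)

model = RandomForestClassifier(random_state=42).fit(X, y)

# A descending Series reads more easily than a chart for a quick ranking.
ranking = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
print(ranking)
```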
Expected Output:
A table with the following structure:
Location Incident_Type Casualties Reported_By Predicted_Severity
0 Chhattisgarh Encounter 3 Patrol Unit A High
1 Jammu & Kashmir IED Blast 5 Patrol Unit B Medium
2 Assam Riots 0 Unit C Low
Two CSV Files
Save a file with the name user_data.csv containing the following data:
IP address,Video streaming history,Subscription details,Payment methods,Email ID,Return history,Purchase amount till this date
16.249.66.250,A; Two; House; Still,Non-Member,UPI,hicksmonica@yahoo.com,Yes,23963.08
148.142.102.204,Law; Various; Its; Pattern,Monthly Plan,Net Banking,lopezrobert@hotmail.com,Yes,30181.04
51.161.116.64,Nearly; Wish; Until; Figure; Heavy,Monthly Plan,Credit Card,jeffreyjones@yahoo.com,No,3987.4
7.144.239.20,Body,Annual Plan,Debit Card,matthew29@hotmail.com,No,12419.92
18.175.69.100,Seek; Partner,Monthly Plan,Credit Card,pittmanmartha@hotmail.com,No,34413.69
134.93.246.240,Issue; Impact; Produce,Annual Plan,UPI,mccoydavid@wilson.com,No,21630.25
156.106.194.245,Point; Goal; Do,Prime Member,Net Banking,trevorleach@yahoo.com,No,21322.08
104.128.150.214,Remain; Safe,Non-Member,Credit Card,tlawrence@walker.com,No,2291.95
135.119.234.89,Rise; Friend; Of,Annual Plan,UPI,njenkins@gmail.com,Yes,14929.72
60.108.118.92,Necessary; Fund; Watch,Non-Member,Wallet,bobby36@yahoo.com,No,29847.41
136.205.199.241,Minute,Annual Plan,Net Banking,brownjustin@guerrero.com,Yes,26946.59
162.200.44.125,Agency; Station; Class; Dark,Non-Member,Wallet,jeremy94@turner-williams.com,Yes,15305.29
214.201.97.62,Institution; Activity,Monthly Plan,Net Banking,masonalison@morris.info,Yes,2952.34
173.116.126.73,Listen,Non-Member,Credit Card,thenry@hotmail.com,No,26587.54
200.33.19.153,Physical; Open,Prime Member,Wallet,robinbrown@wang.com,Yes,4624.42
158.136.72.189,Within,Prime Member,Credit Card,heather81@yahoo.com,Yes,13940.5
109.151.171.74,Book; Deal; Bring,Non-Member,Net Banking,cherylmartinez@thompson.org,No,14858.83
38.169.249.110,House; Guess; Job,Annual Plan,UPI,murrayshannon@hotmail.com,No,5087.71
157.165.166.12,Worry; Open; Unit; State; Travel,Prime Member,Debit Card,justinstewart@gmail.com,No,49074.21
115.57.168.31,Late; Late; Life,Annual Plan,UPI,davidprice@mckee.com,No,2833.29
84.239.206.146,Whom; Notice; Even; Provide; Red,Annual Plan,Credit Card,benjamin46@brown.com,Yes,47998.63
84.122.48.97,View; The; Push; Wide,Monthly Plan,Credit Card,steven51@patterson.biz,Yes,4409.41
178.207.210.34,Happy,Annual Plan,Debit Card,jessicavargas@hotmail.com,No,41068.32
158.100.50.9,Grow,Non-Member,Credit Card,marqueztroy@hotmail.com,Yes,27929.42
22.18.44.219,Relationship; Chance; Set; Leg,Annual Plan,Debit Card,rwilson@figueroa.biz,Yes,24622.55
148.89.93.169,Point; Lawyer; Or; Can,Non-Member,UPI,bbrown@pearson.net,Yes,38228.83
138.77.131.251,Charge; Concern; Particular; Time; Manage,Prime Member,Credit Card,lukejackson@matthews.org,Yes,47985.63
165.97.182.96,Effect,Prime Member,Debit Card,wallssue@reid.com,No,35305.84
106.128.93.87,Often; Alone,Prime Member,Credit Card,william56@davis.org,No,40920.14
53.17.16.119,Knowledge,Prime Member,Wallet,soliswilliam@green-vang.com,No,23781.67
109.19.205.72,Miss; Reflect; Can; Tax; Stop,Annual Plan,Wallet,thomasbrittney@hotmail.com,No,43031.38
71.146.225.237,Future,Monthly Plan,Wallet,joe13@wright-miller.net,Yes,23321.46
21.31.166.98,Opportunity; Actually; Together; Soon,Monthly Plan,UPI,sanchezsusan@love.com,Yes,5268.54
108.136.93.118,Into; Serve; Instead; Throughout,Monthly Plan,Debit Card,jennifermoore@brooks.com,No,42462.77
220.60.47.80,Conference; Feeling; Last; Begin,Non-Member,Debit Card,alexander74@gmail.com,Yes,29034.87
176.84.115.17,Several; Focus,Monthly Plan,Credit Card,lindsay15@gmail.com,Yes,47548.81
39.151.159.215,Able; Enjoy; Court; Evidence; System,Prime Member,Credit Card,gregoryrodgers@roach.com,Yes,28603.16
179.148.117.141,Article,Prime Member,Wallet,xevans@evans-bernard.com,Yes,11972.92
56.190.248.26,Before; Budget,Annual Plan,Credit Card,zimmermanmark@hotmail.com,No,25587.69
10.56.81.199,Into; Time; Oil,Annual Plan,UPI,ronaldgibson@hobbs.com,Yes,42815.87
199.150.24.125,Plan; Unit; Year; Affect; Month,Prime Member,Debit Card,reynoldsnicole@yahoo.com,Yes,41338.66
101.35.216.60,How; Magazine; Stand; She,Annual Plan,Wallet,claytonosborn@yahoo.com,Yes,13705.38
30.252.125.137,Positive; Until; From; Should; Surface,Prime Member,Credit Card,huangalejandro@hotmail.com,Yes,12053.22
121.216.116.30,Hot,Monthly Plan,UPI,davidjacobs@kim.net,No,40311.92
124.198.170.153,Great; First; Behavior; Might,Prime Member,Debit Card,roger07@guerrero.com,Yes,36252.62
118.37.23.210,Down; Gun,Annual Plan,Debit Card,gina67@yahoo.com,No,47386.7
81.137.234.96,Possible,Non-Member,Net Banking,ismith@yahoo.com,Yes,43595.38
11.150.244.131,Law; Here,Annual Plan,Wallet,qgraham@yahoo.com,No,8571.84
14.125.86.127,Leave; Feeling,Annual Plan,Net Banking,james31@baker.com,No,43662.52
183.46.87.119,Decision; Across; Special; Effort; Over,Annual Plan,UPI,michael43@sullivan.org,Yes,1652.87
Save another file with the name other_user_data.csv containing the following data:
IP address,Video streaming history,Subscription details,Payment methods,Email ID,Return history,Purchase amount till this date
18.9.163.26,Large; Five,Annual Plan,Credit Card,george86@hotmail.com,Yes,1674.08
221.165.221.79,Move; Painting,Monthly Plan,Net Banking,laura43@yahoo.com,Yes,18098.02
111.194.239.150,Should; Someone; Job; Skill; Art,Free Trial,Net Banking,joseph83@gmail.com,No,11372.89
55.236.143.56,Ask; Without; Notice; Will; Increase,Annual Plan,Wallet,walkerwilliam@griffin.info,No,48952.71
40.213.23.90,Minute; Develop; Discover; Business; Address,Monthly Plan,Wallet,mconrad@gmail.com,Yes,12848.38
220.31.152.251,Person; On,Free Trial,Wallet,macdonaldbriana@gmail.com,No,6908.27
66.181.34.162,Throw; Everybody; Edge,Free Trial,Credit Card,cynthiajackson@yahoo.com,Yes,33165.87
33.245.226.169,Half; Again; Arm,Free Trial,Credit Card,charlesshelley@gmail.com,Yes,26850.97
73.22.92.105,Different; Choice; Foot; Necessary,Non-Member,UPI,tylerarmstrong@yahoo.com,Yes,40660.61
141.223.118.141,Boy; Already; Child; Laugh; Different,Annual Plan,Wallet,nmacdonald@yahoo.com,No,22967.07
133.65.96.220,Standard; Kid; Yet,Free Trial,UPI,cabrerakimberly@johnson-hodge.org,Yes,14482.93
3.131.249.169,Anyone; Modern; Design; Sell; Mission,Annual Plan,Debit Card,michael14@rivera.net,No,14943.28
78.170.205.208,Share; Member; View; Family; Usually,Monthly Plan,Debit Card,crawfordnathan@hotmail.com,No,17755.28
152.126.183.244,More; Least,Monthly Plan,Net Banking,rosslatoya@hotmail.com,No,43124.02
177.115.49.255,Movie; Station,Non-Member,UPI,ucook@weaver-travis.info,No,27948.94
140.177.14.194,Factor; Defense; Coach; People,Free Trial,Net Banking,lindsey96@gmail.com,No,21471.99
182.213.192.93,Night,Monthly Plan,Debit Card,pphillips@gmail.com,No,35868.33
101.200.116.231,Provide,Free Trial,Wallet,bpatterson@reyes.com,Yes,6876.03
138.96.57.81,Heavy,Monthly Plan,Wallet,walkeredwin@yahoo.com,No,18834.24
68.107.215.152,Poor; Source; Design; Rather,Free Trial,Wallet,charles44@austin-johnson.com,Yes,34290.03
169.138.103.32,Political,Annual Plan,Net Banking,zachary62@herring.biz,Yes,13364.47
175.143.228.29,Plant,Free Trial,Credit Card,deanna21@edwards.com,Yes,21003.82
141.166.141.220,Crime,Free Trial,Net Banking,jessicajohnson@yahoo.com,No,33059.17
148.82.16.138,Benefit; Into; Discuss; Represent,Free Trial,Credit Card,shannonphillips@howell.com,Yes,40392.81
196.170.232.44,Pretty,Annual Plan,Wallet,staceyfreeman@cabrera.org,No,5569.28
218.32.104.4,Network; Pretty; Unit; Culture; Foot,Monthly Plan,Debit Card,hhorton@gmail.com,Yes,30174.75
112.37.19.101,Offer; Player; Accept; Grow; Clear,Non-Member,Net Banking,reevescaroline@yahoo.com,No,12240.03
176.72.6.248,Democratic; Tv; Analysis,Non-Member,Wallet,virginia75@griffith.com,No,9182.99
70.32.180.102,He; Article; Low; Light; Difference,Monthly Plan,UPI,jrowe@gates-buchanan.info,No,10707.69
28.224.146.240,Change; Along; Method; Thus,Annual Plan,Net Banking,jennifermitchell@yahoo.com,Yes,19257.34
105.177.12.4,Something; Recognize; Too,Monthly Plan,Debit Card,graygreg@lawrence.com,Yes,27640.53
111.186.84.137,Sport; Such,Free Trial,Wallet,cummingswilliam@harris.com,Yes,40863.27
213.118.229.231,Reflect; Itself; Again; Radio; Than,Monthly Plan,Debit Card,michael59@rogers.biz,Yes,27174.34
22.185.171.76,Himself; Else; Including; Record; Major,Monthly Plan,Credit Card,owenscarolyn@hotmail.com,No,8571.43
29.150.187.187,Example; Draw; Message,Free Trial,UPI,lthomas@perez.com,No,17541.53
218.172.34.132,Teacher; Drive; Candidate; Beat,Free Trial,Credit Card,mannsamuel@harris.com,Yes,22669.28
5.137.68.53,Ability; House,Monthly Plan,Wallet,chadgarcia@yahoo.com,No,4982.53
140.56.9.53,Up,Non-Member,UPI,pjackson@gmail.com,Yes,26744.73
28.184.129.161,Month; Race; This; Many,Non-Member,Credit Card,howelljonathan@suarez-williams.biz,No,43690.33
174.221.2.166,Dark; Protect,Monthly Plan,Wallet,angelaandrews@elliott-rodriguez.com,No,3766.25
38.132.49.124,Western; Low; College,Free Trial,Debit Card,geraldrodriguez@hotmail.com,No,15378.1
35.233.113.185,Yeah; Could; Environmental; Power,Annual Plan,UPI,cynthia41@hotmail.com,Yes,26035.79
89.239.149.109,Scientist; Standard; Rule; Amount,Free Trial,Debit Card,sherigrant@davis.info,No,31944.84
23.64.234.180,Factor; Consider; Assume,Free Trial,Net Banking,kdixon@miranda.org,No,23435.71
4.163.167.221,Sometimes; Else; Table; Music,Annual Plan,Net Banking,powellkeith@wilson.biz,No,34815.31
198.98.205.122,Tonight,Annual Plan,Debit Card,jennifer42@hotmail.com,Yes,31493.27
86.253.237.81,Down,Non-Member,Credit Card,ncarlson@santiago.com,No,41932.71
159.117.159.234,Read; Prevent,Annual Plan,Wallet,webbstephen@gmail.com,No,31304.67
143.67.90.90,On; Discuss; Visit; Avoid,Annual Plan,Wallet,vanessagreen@chen.com,Yes,2032.34
183.131.115.45,Many,Free Trial,Wallet,philip65@kent.biz,Yes,15690.88
Step 1: Use the following code to upload the files using google.colab.files:
from google.colab import files
uploaded = files.upload()
This will open a file upload dialog. Upload your user_data.csv and other_user_data.csv files (the dialog lets you select both at once).
If you uploaded only one file, run the same code again to upload the second:
uploaded = files.upload()  # Upload the second file
Step 2: Load the uploaded files into pandas DataFrames using this code:
import pandas as pd
# Load the first file
user_data = pd.read_csv("user_data.csv")
# Load the second file
other_user_data = pd.read_csv("other_user_data.csv")
Step 3: Verify that the files are loaded correctly by displaying the first few rows of each DataFrame:
# Display the first few rows of the first DataFrame
print("User Data:")
print(user_data.head())
# Display the first few rows of the second DataFrame
print("\nOther User Data:")
print(other_user_data.head())
Step 4: Combine both DataFrames into one.
# Combine the two DataFrames; ignore_index=True rebuilds a clean 0..n-1 index
combined_data = pd.concat([user_data, other_user_data], ignore_index=True)
Identify the top 5 purchasing persons by sorting the combined data based on the "Purchase amount till this date" column.
# Sort combined data by "Purchase amount till this date" in descending order
top_5_purchasers = combined_data.sort_values(by="Purchase amount till this date", ascending=False).head(5)
# Display the top 5 purchasers
print(top_5_purchasers)
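As an aside, pandas' nlargest does the sort-and-take-top in one call, which avoids sorting the whole frame. A sketch on a tiny made-up version of combined_data:

```python
import pandas as pd

# Hypothetical mini version of combined_data.
combined_data = pd.DataFrame({
    "Email ID": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "Purchase amount till this date": [100.0, 400.0, 250.0, 300.0],
})

# Top 3 rows by purchase amount, highest first.
top = combined_data.nlargest(3, "Purchase amount till this date")
print(top)
```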
Step 5
Analyze the distribution of "Purchase amount till this date" across all users using a histogram.
import matplotlib.pyplot as plt
# Plot histogram of purchase amounts
plt.figure(figsize=(10, 6))
plt.hist(combined_data["Purchase amount till this date"], bins=20, edgecolor='k')
plt.title("Distribution of Purchase Amounts", fontsize=16)
plt.xlabel("Purchase Amount", fontsize=14)
plt.ylabel("Frequency", fontsize=14)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
Step 6
Analyze the trend of average purchase amount for different subscription types.
# Group by subscription details and calculate average purchase amount
subscription_trend = combined_data.groupby("Subscription details")["Purchase amount till this date"].mean()
# Plot the trend
plt.figure(figsize=(10, 6))
subscription_trend.plot(kind='bar', color='skyblue', edgecolor='k')
plt.title("Average Purchase Amount by Subscription Type", fontsize=16)
plt.xlabel("Subscription Type", fontsize=14)
plt.ylabel("Average Purchase Amount", fontsize=14)
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
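An average can mislead when subscription groups have very different sizes, so it helps to report the group count alongside the mean. A sketch on a tiny made-up combined_data:

```python
import pandas as pd

# Hypothetical mini version of combined_data.
combined_data = pd.DataFrame({
    "Subscription details": ["Annual Plan", "Annual Plan", "Monthly Plan", "Non-Member"],
    "Purchase amount till this date": [100.0, 300.0, 50.0, 80.0],
})

# agg() computes both statistics in one pass per group.
summary = combined_data.groupby("Subscription details")["Purchase amount till this date"].agg(["mean", "count"])
print(summary)
```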