1. What is the primary goal of data science?
- To manipulate data for financial profit
- To store large datasets
- To create complex mathematical models only
- To convert raw data into valuable insights and predictions
2. Which of the following is a common tool used for data visualization?
- Jupyter Notebook
- Tableau
- TensorFlow
- Hadoop
3. What does the term 'Big Data' refer to?
- Large volumes of data that are too complex for traditional processing tools
- Small sets of structured data
- Data that only businesses can access
- Only unstructured data
4. Which of the following is an essential skill for data scientists?
- Graphic design
- Data cleaning and preprocessing
- Writing and editing documents
- Social media management
5. Which AI technique is primarily used in supervised learning?
- Neural Networks
- Deep Learning
- K-means Clustering
- Decision Trees
6. Which of the following is an example of unstructured data?
- Customer age data
- Audio recordings
- Data from a database
- Excel spreadsheets
7. What is a feature in a dataset?
- The label or target variable
- A row in the dataset
- A column that holds measurable characteristics
- A group of algorithms
8. Which of the following is an example of supervised learning?
- K-means clustering
- Linear regression
- Principal Component Analysis
- Reinforcement learning
9. Which of the following algorithms is used for classification tasks in machine learning?
- K-means clustering
- Support Vector Machines
- Linear Regression
- K-Nearest Neighbors
10. What is the purpose of normalization in data preprocessing?
- To scale features so they have a standard range
- To remove irrelevant data
- To categorize data into groups
- To delete duplicate entries
11. What type of machine learning problem does the 'K-means clustering' algorithm solve?
- Regression
- Classification
- Unsupervised learning (clustering)
- Reinforcement learning
12. What is the purpose of the confusion matrix in machine learning?
- To track the performance of a machine learning model
- To visualize the model's loss function
- To test model predictions with multiple metrics
- To analyze model errors in classification tasks
13. Which of the following tools is used for data wrangling and cleaning?
- Scikit-learn
- Pandas
- Matplotlib
- TensorFlow
14. What does 'feature engineering' refer to in machine learning?
- Selecting and transforming raw data into meaningful input features for models
- The process of selecting the appropriate machine learning algorithm
- Cleaning the data by removing null values
- Reducing the number of features for simpler models
15. Which of the following is a key benefit of using big data tools like Hadoop?
- It simplifies data analysis by only allowing structured data
- It speeds up the process of writing and editing code
- It allows for processing of very large datasets across multiple servers
- It creates automated content for websites
16. What is the main function of natural language processing (NLP) in AI?
- To classify images based on visual features
- To process and analyze human language data
- To create recommendation systems
- To predict stock market trends
17. Which technique is used to improve the generalization of a machine learning model?
- Data augmentation
- Deleting features
- Using only a small sample of data
- Simplifying the algorithm
18. Which of the following libraries is commonly used for building machine learning models in Python?
- Pandas
- NumPy
- Scikit-learn
- Flask
19. What is the purpose of a loss function in machine learning?
- To measure how well the model's predictions align with the actual data
- To select the best features for training
- To perform feature scaling
- To visualize data points in 3D
20. Which of the following is an example of reinforcement learning?
- A robot learning to play a game by receiving rewards for correct actions
- Clustering similar images together
- Predicting housing prices based on features
- Sorting products in a warehouse
21. What does the term 'dimensionality reduction' refer to in data science?
- Reducing the amount of noise in the dataset
- Reducing the number of input features while preserving as much information as possible
- Increasing the size of the dataset
- Removing outliers from the dataset
22. What is the purpose of the "train-test split" in machine learning?
- To divide the data into training and testing sets to evaluate model performance
- To reduce the size of the dataset
- To create labels for unstructured data
- To increase the amount of data available for analysis
23. What does a 'decision tree' algorithm do in machine learning?
- It organizes data into hierarchical structures for classification or regression tasks
- It creates a random set of data points
- It stores large amounts of data in a structured format
- It builds models for text analysis
24. Which of the following is a common evaluation metric for classification models?
- Mean squared error
- Accuracy
- Precision
- Both accuracy and precision
25. What does the term "hyperparameter tuning" refer to?
- Adjusting the features of the dataset
- Selecting the right machine learning model
- Tuning settings chosen before training, such as learning rate or tree depth, to improve performance
- Increasing the size of the training data
26. Which of the following is an example of unsupervised learning?
- K-means clustering
- Linear regression
- Random forests
- Naive Bayes
27. What is the primary purpose of cross-validation in machine learning?
- To make predictions faster
- To repeatedly split data into training and validation sets to evaluate model performance more reliably
- To remove duplicate data
- To automate feature engineering
28. In the context of deep learning, what does the term 'neural network' refer to?
- A model inspired by the human brain to recognize patterns
- A group of algorithms designed to sort large datasets
- A system for storing and accessing data
- A technique for dimensionality reduction
29. In deep learning, what is the role of an activation function?
- To ensure that the output is scaled to a specific range
- To introduce non-linearity in the neural network
- To prevent overfitting
- To monitor the model's training progress
30. In the context of machine learning, what is overfitting?
- When a model is too simple to make accurate predictions
- When a model performs well on the training data but poorly on new data
- When a model doesn't fit the training data at all
- When the model is too general to provide any insights
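
Worked example (illustrative only). Several questions above mention normalization, the train-test split, and the confusion matrix. The sketch below shows one simple way these ideas could look in plain Python, using only the standard library. The function names, the sample values, and the 25% test fraction are assumptions made for illustration; they are not taken from the quiz and are not the API of any particular library.

```python
import random


def min_max_normalize(values):
    """Rescale numbers to the range [0, 1] (one common form of normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def confusion_matrix(actual, predicted):
    """Count outcomes for binary labels, where 1 is positive and 0 is negative."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(cm):
    """Fraction of all predictions that were correct."""
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


def precision(cm):
    """Fraction of positive predictions that were actually positive."""
    predicted_positive = cm["tp"] + cm["fp"]
    return cm["tp"] / predicted_positive if predicted_positive else 0.0


if __name__ == "__main__":
    # Hypothetical sample data, chosen only to keep the arithmetic easy to follow.
    print(min_max_normalize([150, 160, 170, 180, 190]))

    train, test = train_test_split(list(range(8)))
    print(len(train), "training rows,", len(test), "test rows")

    actual = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]
    cm = confusion_matrix(actual, predicted)
    print(cm, accuracy(cm), precision(cm))
```

In practice, libraries such as Scikit-learn provide tested versions of these steps; the hand-written functions here are only meant to make the underlying arithmetic visible.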