Unit I: Big Data MCQ
1. How many V’s are used to characterize Big Data?
(A) 3
(B) 4
(C) 5
(D) 6
Answer: c
2. Data in a Relational Database is:
(A) Structured
(B) Un-Structured
(C) Semi Structured
(D) Meta Data
Answer: a
3. In how many forms is data found in Big Data?
(A) 2
(B) 3
(C) 4
(D) 5
Answer: b
4. What kind of data is found in log files?
(A) Structured
(B) Un-Structured
(C) Semi Structured
(D) Meta Data
Answer: c
5. What percentage of the world’s total data was created within the past two years?
(A) 80%
(B) 85%
(C) 90%
(D) 95%
Answer: c
6. What are the main components of Big Data analytics?
(A) MapReduce
(B) HDFS
(C) YARN
(D) All of the above
Answer: d
6. What are the major benefits of Big Data Processing?
(A) Businesses can utilize outside intelligence while taking decisions
(B) Improved customer service
(C) Better operational efficiency
(D) All of the above
Answer: d
7. Hadoop is written in which programming language?
(A) C
(B) C++
(C) Java
(D) Python
Answer: c
8. Which of the following options are NOT related to big data problem(s)?
(A) Parsing 5 MB XML file every 2 minutes
(B) Processing the twitter data
(C) Processing online banking transactions
(D) both (a) and (c)
Answer: d
9. What does the characteristic “Velocity” in Big Data represent?
(A) Speed of input data generation
(B) Speed of individual machine processors
(C) Speed of ONLY storing data
(D) Speed of storing and processing data
Answer: d
10. Which of the following are example(s) of Real Time Big Data Processing?
(A) Complex Event Processing (CEP) platforms
(B) Stock market data analysis
(C) Bank fraud transactions detection
(D) both (a) and (c)
Answer: d
11. Hadoop is open source.
(A) ALWAYS True
(B) True only for Apache Hadoop
(C) True only for Apache and Cloudera Hadoop
(D) ALWAYS False
Answer: b
12. Which of the following is not an example of Social Media?
(A) Twitter
(B) Google
(C) Instagram
(D) YouTube
Answer: b
13. By 2027, the volume of digitally produced data will reach the scale of
(A) TB
(B) YB
(C) ZB
(D) EB
Answer: c
14. What is needed to draw insights for a business?
(A) Collecting the data
(B) Storing the data
(C) Analysing the data
(D) All the above
Answer: d
15. Facebook uses “Big Data” to determine the behavior of its users. Is this true or false?
(A) TRUE
(B) FALSE
Answer: a
16. Data that is too huge and complex to store and process is known as
(A) Analytics
(B) Data mining
(C) Big Data
(D) Data Warehouse
Answer: c
17. Data generated from online transactions is one example of the volume of big data. Is this true or false?
(A) TRUE
(B) FALSE
Answer: a
18. Velocity is the speed at which the data is processed
(A) TRUE
(B) FALSE
Answer: b
19. ______ data has a structure but cannot be stored in a database.
(A) Structured
(B) Semi-Structured
(C) Unstructured
(D) None of these
Answer: b
20. ______ refers to the ability to turn your data into something useful for business.
(A) Velocity
(B) Variety
(C) Value
(D) Volume
Answer: c
21. Value tells the trustworthiness of data in terms of quality and accuracy.
(A) TRUE
(B) FALSE
Answer: b
22. Files are divided into ______ sized chunks.
(A) Static
(B) Dynamic
(C) Fixed
(D) Variable
Answer: c
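As a rough illustration of question 22, the sketch below splits a byte string into fixed-size chunks, the way a block-based file system divides a file. The function name `split_into_chunks` and the tiny chunk size are assumptions made for this example; real systems such as HDFS use far larger blocks.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data into consecutive chunks of chunk_size bytes.

    Every chunk has the same size except possibly the last one,
    which holds whatever bytes remain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


if __name__ == "__main__":
    # A 10-byte "file" split into 4-byte chunks: two full chunks and one remainder.
    print(split_into_chunks(b"abcdefghij", 4))  # [b'abcd', b'efgh', b'ij']
```

Only the final chunk can be shorter than the configured size, which is why the chunk size is described as fixed rather than variable.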
23. ______ is an open source framework for storing data and running applications on clusters of commodity hardware.
(A) HDFS
(B) Hadoop
(C) MapReduce
(D) Cloud
Answer: b
24. Hadoop MapReduce allows you to perform distributed parallel processing on large volumes of data quickly and efficiently. Is this statement true or false?
(A) TRUE
(B) FALSE
Answer: a
25. In a Relational Database Management System, the property of scaling is applicable.
(A) TRUE
(B) FALSE
Answer: b
26. Which of the following options is not an example of NoSQL?
(A) Google
(B) Netflix
(C) Amazon
(D) CERN
Answer: c
27. Scalability and better performance of NoSQL are achieved by sacrificing ACID compatibility. Is this true?
(A) TRUE
(B) FALSE
Answer: a
28. In NoSQL, scalability and better performance are attained by compromising ACID compatibility. Is this true?
(A) TRUE
(B) FALSE
Answer: a
29. ______ is a programming model for writing applications that can process Big Data in parallel on multiple nodes.
(A) HDFS
(B) MapReduce
(C) Hadoop
(D) Hive
Answer: b
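To make the programming model in question 29 concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases applied to word counting. It is not Hadoop code. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and on a real cluster each phase would run in parallel across nodes.

```python
from collections import defaultdict


def map_phase(document: str) -> list[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: dict[str, list[int]]) -> dict[str, int]:
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: list[str]) -> dict[str, int]:
    """Run the three phases in sequence over all documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
    # {'big': 2, 'data': 1, 'cluster': 1}
```

Because each call to `map_phase` depends only on its own document and each key is reduced independently, both phases can be distributed across machines without coordination beyond the shuffle.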
30. Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?
(A) Decision Tree
(B) Regression
(C) Classification
(D) Random Forest
Answer: d
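The bagging idea behind question 30 can be sketched in a few lines: train several weak learners on bootstrap samples (drawn with replacement) and combine their predictions by majority vote. The example below uses one-dimensional threshold "stumps" instead of full decision trees and omits the random feature selection that Random Forest adds, so it illustrates bagging only. All names and the toy data are assumptions made for this sketch.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def train_stump(rows):
    """Weak learner: choose the threshold t (taken from the sample's x values)
    that minimizes training errors for the rule 'predict 1 if x > t else 0'."""
    best_t, best_err = None, None
    for t, _ in rows:
        err = sum((x > t) != bool(y) for x, y in rows)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def train_bagged_stumps(rows, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(rows, rng)) for _ in range(n_estimators)]


def bagged_predict(stumps, x):
    """Aggregate the stumps' predictions by majority vote."""
    votes = Counter(int(x > t) for t in stumps)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: (x, label) pairs where small x is class 0 and large x is class 1.
    data = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]
    stumps = train_bagged_stumps(data)
    print(bagged_predict(stumps, 0), bagged_predict(stumps, 9))
```

Averaging over many resampled learners reduces the variance of any single one, which is the motivation for bagging.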
31. A data set is:
(A) Tweets stored in a flat file
(B) A collection of image files in a directory
(C) An extract of rows from a database table stored in a CSV formatted file
(D) All the above
Answer: d
32. Data analysis is the process of:
(A) Examining data to find facts
(B) Relationships,
(C) Patterns, insights and/or trends.
(D) All the above
Answer: d
33. What are the general categories of analytics that are distinguished by the results they produce:
(A) Descriptive analytics
(B) Diagnostic analytics
(C) Predictive analytics
(D) All the above
Answer: d
34. BI enables an organization to gain insight into the performance of an enterprise
(A) By analyzing data generated by its business processes and information systems.
(B) By examining data to find facts
(C) From relationships,
(D) All the above
Answer: a
35. Data variety refers to:
(A) Multiple schemas
(B) Multiple formats and types of data
(C) Multiple Data Models
(D) None of above
Answer: b
36. Unstructured Data Consists of:
(A) Text file, Audio Files.
(B) Video files, Text data
(C) Tagged Data
(D) Both (a) and (b)
Answer: d
37. Internal and external data in big data come from multiple sources such as:
(A) Sensors, Social network sites
(B) Email, XML, Multimedia
(C) Both (a) and (b)
(D) None of the above
Answer: c
38. The ingestion layer should have the capability to:
(A) validate, cleanse, transform, reduce
(B) integrate
(C) Preprocess the data
(D) Both (a) and (b)
Answer: d
39. According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?
(A) Big data management and data mining
(B) Data warehousing and business intelligence
(C) Management of Hadoop clusters
(D) Collecting and storing unstructured data
Answer: a
40. What are the main components of Big Data?
(A) MapReduce
(B) HDFS
(C) YARN
(D) All of these
Answer: (d)
41. What are the different features of Big Data Analytics?
(A) Open-Source
(B) Scalability
(C) Data Recovery
(D) All the above
Answer: (d)
42. What are the four V’s of Big Data?
(A) Volume
(B) Velocity
(C) Variety
(D) All the above
Answer: (d)
43. All of the following accurately describe Hadoop, EXCEPT:
(A) Open-source
(B) Real-time
(C) Java-based
(D) Distributed computing approach
Answer: (b)
44. _____ is general-purpose computing model and runtime system for distributed data analytics.
(A) MapReduce
(B) Drill
(C) Oozie
(D) None of the above
Answer: (a)
45. The examination of large amounts of data to see what patterns or other useful information can be found is known as
(A) Data examination
(B) Information analysis
(C) Big data analytics
(D) Data analysis
Answer: (c)
46. Big data analysis does the following except
(A) Collects data
(B) Spreads data
(C) Organizes data
(D) Analyzes data
Answer: (b)
47. What makes Big Data analysis difficult to optimize?
(A) Big Data is not difficult to optimize
(B) Both data and cost effective ways to mine data to make business sense out of it
(C) The technology to mine data
(D) All of the above
Answer: (b)
48. The new source of big data that will trigger a Big Data revolution in the years to come is
(A) Business transactions
(B) Social media
(C) Transactional data and sensor data
(D) RDBMS
Answer: (c)
49. The unit of data that flows through a Flume agent is
(A) Log
(B) Row
(C) Event
(D) Record
Answer: (c)
50. All of the following are steps followed to deploy a Big Data solution except
(A) Data Ingestion
(B) Data Processing
(C) Data dissemination
(D) Data Storage
Answer: (c)
51. Which industries use so-called “Big Data” in their day-to-day operations?
(A) Weather forecasting
(B) Marketing
(C) Healthcare
(D) All of the above
Answer: (d)
52. There are almost as many bits of information in the digital universe as there are stars in the actual universe.
(A) True
(B) False
Answer: (a)
53. The term ‘Big Data’ was coined by
(A) Roger Mougalas
(B) John Philips
(C) Simon Woods
(D) Martin Green
Answer: (a)
54. The word ‘Big Data’ was coined in the year
(A) 2000
(B) 1970
(C) 1998
(D) 2005
Answer: (c)
55. Concerning the forms of Big Data, which one of these is the odd one out?
(A) Structured
(B) Unstructured
(C) Processed
(D) Semi-Structured
Answer: (c)
56. Big Data applications benefit the media and entertainment industry by
(A) Predicting what the audience wants
(B) Ad targeting
(C) Scheduling optimization
(D) All of the above
Answer: (d)
57. The feature of big data that refers to the quality of the stored data is __
(A) Variety
(B) Volume
(C) Variability
(D) Veracity
Answer: (d)
58. __ is a framework for performing remote procedure calls and data serialization.
(A) Drill
(B) BigTop
(C) Avro
(D) Chukwa
Answer: c
59. Which of the following is a characteristic of Big Data?
(A) Huge volume of data
(B) Complexity of data types and structures
(C) Speed of data creation and growth
(D) All of the mentioned
Answer: d
60. Concurrent access to shared data may result in _______
(A) Data consistency
(B) Data insecurity
(C) Data inconsistency
(D) None of the mentioned
Answer: c
61. Mutual exclusion implies that:
(A) If a process is executing in its critical section, then no other process must be executing in their critical sections
(B) If a process is executing in its critical section, then other processes must be executing in their critical sections
(C) If a process is executing in its critical section, then all the resources of the system must be blocked until it finishes execution
(D) None of the mentioned
Answer: a
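Questions 60 and 61 can be illustrated together: the sketch below guards a shared counter with a lock so that only one thread at a time executes the critical section, which keeps the shared data consistent. It uses Python's standard `threading.Lock`. The class name `SafeCounter` and the helper `run_workers` are hypothetical names chosen for this example.

```python
import threading


class SafeCounter:
    """A counter whose increment is protected by a mutual-exclusion lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Critical section: only one thread may hold the lock at a time,
        # so the read-modify-write on self.value cannot interleave.
        with self._lock:
            self.value += 1


def run_workers(num_threads: int, increments_per_thread: int) -> int:
    """Start several threads that all increment the same counter."""
    counter = SafeCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(4, 1000))  # 4000
```

Without the lock, concurrent updates to `value` could be lost, which is the data inconsistency that question 60 refers to.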
62. In the memory hierarchy, as the speed of operation increases the memory size also increases.
(A) True
(B) False
Answer: b
63. To use a ______ network service, the service user first establishes a connection, uses the connection, and terminates the connection.
(A) Connection-oriented
(B) Connection-less
(C) Service-oriented
(D) Service-less
Answer: a
64. Which layer is responsible for process-to-process delivery?
(A) Network
(B) Transport
(C) Application
(D) Physical
Answer: b
65. ______ refers to the biases, noise and abnormality in data, and the trustworthiness of data.
(A) Value
(B) Veracity
(C) Velocity
(D) Volume
Answer: b
66. ______ refers to the connectedness of big data.
(A) Value
(B) Veracity
(C) Velocity
(D) Valence
Answer: d