Unit-V : Big data MCQ
1. A ____ serves as the master and there is only one NameNode per cluster.
(A) Data Node
(B) NameNode
(C) Data block
(D) Replication
Answer: (b)
2. Point out the correct statement.
(A) DataNode is the slave/worker node and holds the user data in the form of Data Blocks
(B) Each incoming file is broken into 32 MB by default
(C) Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault tolerance
(D) None of the mentioned
Answer: (a)
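Note on question 2: the way a file is cut into Data Blocks and replicated can be illustrated with a small calculation. This is a minimal sketch, not HDFS code. The function names are hypothetical, and the 128 MB block size and replication factor of 3 are assumptions (they match the usual defaults in Hadoop 2.x and later, but the actual values come from the cluster's `dfs.blocksize` and `dfs.replication` settings).

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size=128 * MB):
    """Return the sizes (in bytes) of the blocks a file would occupy."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # The last block only holds the leftover bytes; it is not padded.
    return [block_size] * full_blocks + ([remainder] if remainder else [])

def raw_storage(file_size, replication=3):
    """Total bytes stored across the cluster once every block is replicated."""
    return file_size * replication

blocks = split_into_blocks(300 * MB)
print(len(blocks), [b // MB for b in blocks])  # 3 [128, 128, 44]
print(raw_storage(300 * MB) // MB)             # 900
```

Replicating each block on several DataNodes is what gives a high, not low, degree of fault tolerance, which is why option (C) is wrong.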
3. HDFS works in a ____ fashion.
(A) master-worker
(B) master-slave
(C) worker/slave
(D) all of the mentioned
Answer: (a)
4. ____ NameNode is used when the Primary NameNode goes down.
(A) Rack
(B) Data
(C) Secondary
(D) None of the mentioned
Answer: (c)
5. Which of the following scenarios may not be a good fit for HDFS?
(A) HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
(B) HDFS is suitable for storing data related to applications requiring low latency data access
(C) HDFS is suitable for storing data related to applications requiring low latency data access
(D) None of the mentioned
Answer: (a)
6. ____ is the slave/worker node and holds the user data in the form of Data Blocks.
(A) DataNode
(B) NameNode
(C) Data block
(D) Replication
Answer: (a)
7. HDFS provides a command line interface called ____ used to interact with HDFS.
(A) “HDFS Shell”
(B) “FS Shell”
(C) “DFS Shell”
(D) None of the mentioned
Answer: (b)
8. For YARN, the _____ Manager UI provides host and port information.
(A) Data Node
(B) NameNode
(C) Resource
(D) Replication
Answer: (c)
9. During start up, the _____ loads the file system state from the fsimage and the edits log file.
(A) DataNode
(B) NameNode
(C) ActionNode
(D) None of the mentioned
Answer: (b)
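Note on question 9: the idea of rebuilding state from a checkpoint plus a log of later changes can be shown with a toy model. This is only a sketch under stated assumptions: the real fsimage and edits files are binary formats, and the function name, the operation names ("create", "delete") and the paths below are hypothetical.

```python
def load_namespace(fsimage, edits):
    """Rebuild the in-memory namespace from a snapshot plus later edits."""
    namespace = set(fsimage)      # start from the checkpointed state
    for op, path in edits:        # replay the logged changes in order
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
        else:
            raise ValueError(f"unknown operation: {op}")
    return namespace

fsimage = {"/data/a.txt", "/data/b.txt"}
edits = [("create", "/data/c.txt"), ("delete", "/data/a.txt")]
print(sorted(load_namespace(fsimage, edits)))  # ['/data/b.txt', '/data/c.txt']
```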
10. In HDFS the files cannot be
(A) read
(B) deleted
(C) executed
(D) Archived
Answer: (c)
11. Which of the following commands sets the value of a particular configuration variable (key)?
(A) set -v
(B) set <key>=<value>
(C) set
(D) reset
Answer: (b)
12. Which of the following operators executes a shell command from the Hive shell?
(A) |
(B) !
(C) ^
(D) +
Answer: (b)
13. Hive-specific commands can be run from Beeline when the Hive ___ driver is used.
(A) ODBC
(B) JDBC
(C) ODBC-JDBC
(D) All of the Mentioned
Answer: (b)
14. Which of the following data types is not supported by Hive?
(A) map
(B) record
(C) string
(D) enum
Answer: (d)
15. Avro-backed tables can simply be created by using ___ in a DDL statement.
(A) “STORED AS AVRO”
(B) “STORED AS HIVE”
(C) “STORED AS AVROHIVE”
(D) “STORED AS SERDE”
Answer: (a)
16. Types that may be null must be defined as a __ of that type and Null within Avro.
(A) Union
(B) Intersection
(C) Set
(D) All of the mentioned
Answer: (a)
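Note on question 16: a nullable field in an Avro schema is written as a union of "null" and the actual type. The sketch below builds such a schema as JSON using only the standard library. The record name `User`, the field names and the helper `nullable` are hypothetical and chosen for illustration.

```python
import json

def nullable(avro_type):
    """Return an Avro union that allows either null or the given type."""
    return ["null", avro_type]

schema = {
    "type": "record",
    "name": "User",  # hypothetical record name
    "fields": [
        {"name": "id", "type": "long"},
        # "null" comes first so that the default value null is valid.
        {"name": "email", "type": nullable("string"), "default": None},
    ],
}

schema_json = json.dumps(schema, indent=2)
print(schema_json)
```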
17. ___ is interpolated into the quotes to correctly handle spaces within the schema.
(A) $SCHEMA
(B) $ROW
(C) $SCHEMASPACES
(D) $NAMESPACES
Answer: (a)
18. ____ was designed to overcome the limitations of the other Hive file formats.
(A) ORC
(B) OPC
(C) ODC
(D) None of the mentioned
Answer: (a)
19. An ORC file contains groups of row data called ____
(A) postscript
(B) stripes
(C) script
(D) none of the mentioned
Answer: (b)
20. HBase is a distributed ____ database built on top of the Hadoop file system.
(A) Column-oriented
(B) Row-oriented
(C) Tuple-oriented
(D) None of the mentioned
Answer: (a)
21. HBase is ____; it defines only column families.
(A) Row Oriented
(B) Schema-less
(C) Fixed Schema
(D) All of the mentioned
Answer: (b)
22. The ___ Server assigns regions to the region servers and takes the help of Apache ZooKeeper for this task.
(A) Region
(B) Master
(C) Zookeeper
(D) All of the mentioned
Answer: (b)
23. Which of the following commands provides information about the user?
(A) status
(B) version
(C) whoami
(D) user
Answer: (c)
24. ___ command fetches the contents of a row or a cell.
(A) select
(B) get
(C) put
(D) none of the mentioned
Answer: (b)
25. HBaseAdmin and ____ are the two important classes in this package that provide DDL functionalities.
(A) HTableDescriptor
(B) HDescriptor
(C) HTable
(D) HTabDescriptor
Answer: (a)
26. The minimum number of row versions to keep is configured per column family via ___
(A) HBaseDescriptor
(B) HTabDescriptor
(C) HColumnDescriptor
(D) All of the mentioned
Answer: (c)
27. HBase supports a ____ interface via Put and Result.
(A) “bytes-in/bytes-out”
(B) “bytes-in”
(C) “bytes-out”
(D) none of the mentioned
Answer: (a)
28. One supported data type that deserves special mention is ____
(A) money
(B) counters
(C) smallint
(D) tinyint
Answer: (b)
29. ____ does re-write data and pack rows into columns for certain time-periods.
(A) OpenTS
(B) OpenTSDB
(C) OpenTSD
(D) OpenDB
Answer: (b)
30. The ____ command disables, drops, and recreates a table.
(A) drop
(B) truncate
(C) delete
(D) none of the mentioned
Answer: (b)
34. When a ___ is triggered, the client receives a packet saying that the znode has changed.
(A) event
(B) watch
(C) row
(D) value
Answer: (b)
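Note on question 34: the behaviour of a watch can be sketched with a toy in-memory znode. This is not the ZooKeeper client API; the class `ToyZnode` and its methods are hypothetical. It assumes the classic one-time-trigger semantics, where a watch fires once and must be registered again to receive further notifications.

```python
class ToyZnode:
    """In-memory stand-in for a znode with one-time data watches."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        """Read the data and optionally register a one-time watch."""
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        """Change the data and notify, then clear, the registered watches."""
        self.data = data
        watchers, self._watchers = self._watchers, []
        for notify in watchers:
            # The notification says that the znode changed, not what it became.
            notify("NodeDataChanged")

events = []
node = ToyZnode(b"v1")
node.get_data(watcher=events.append)  # read and register a watch
node.set_data(b"v2")                  # the watch fires once
node.set_data(b"v3")                  # no watch is registered any more
print(events)  # ['NodeDataChanged']
```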
35. The underlying client-server protocol has changed in version ___ of ZooKeeper.
(A) 2.0.0
(B) 3.0.0
(C) 4.0.0
(D) 6.0.0
Answer: (b)
36. A number of constants used in the client ZooKeeper API were renamed in order to reduce ____ collision.
(A) value
(B) namespace
(C) counter
(D) none of the mentioned
Answer: (b)
37. ZooKeeper allows distributed processes to coordinate with each other through registers, known as _____
(A) znodes
(B) hnodes
(C) vnodes
(D) rnodes
Answer: (a)
38. Zookeeper essentially mirrors the ___ functionality exposed in the Linux kernel.
(A) iread
(B) inotify
(C) iwrite
(D) icount
Answer: (b)
39. ZooKeeper’s architecture supports high ____ through redundant services.
(A) flexibility
(B) scalability
(C) availability
(D) interactivity
Answer: (c)
40. You need to have ___ installed before running ZooKeeper.
(A) Java
(B) C
(C) C++
(D) SQLGUI
Answer: (a)
41. To register a “watch” on znode data, you need to use the ___ commands to access the current content or metadata.
(A) stat
(B) put
(C) receive
(D) gets
Answer: (a)
42. ___ has a design policy of using ZooKeeper only for transient data.
(A) Hive
(B) Impala
(C) Hbase
(D) Oozie
Answer: (c)
43. The ____ master will register its own address in this znode at startup, making this znode the source of truth for identifying which server is the Master.
(A) active
(B) passive
(C) region
(D) all of the mentioned
Answer: (a)
44. Pig mainly operates in how many modes?
(A) Two
(B) Three
(C) Four
(D) Five
Answer: (a)
45. You can run Pig in batch mode using ____
(A) Pig shell command
(B) Pig scripts
(C) Pig options
(D) All of the mentioned
Answer: (b)
46. Which of the following functions is used to read data in Pig?
(A) WRITE
(B) READ
(C) LOAD
(D) None of the mentioned
Answer: (c)
47. You can run Pig in interactive mode using the __ shell.
(A) Grunt
(B) FS
(C) HDFS
(D) None of the mentioned
Answer: (a)
48. Which of the following is the default mode?
(A) Mapreduce
(B) Tez
(C) Local
(D) All of the mentioned
Answer: (a)
49. ____ is a platform for constructing data flows for extract, transform, and load (ETL) processing and analysis of large datasets.
(A) Pig Latin
(B) Oozie
(C) Pig
(D) Hive
Answer: (c)
50. Hive also supports custom extensions written in:
(A) C
(B) C++
(C) C#
(D) Java
Answer: (d)
51. Which of the following is not true about Pig?
(A) Apache Pig is an abstraction over MapReduce
(B) Pig cannot perform all the data manipulation operations in Hadoop.
(C) Pig is a tool/platform which is used to analyze larger sets of data representing them as data flows.
(D) None of the above
Ans : b
52. Which of the following is/are a feature of Pig?
(A) Rich set of operators
(B) Ease of programming
(C) Extensibility
(D) All of the above
Ans : d
53. In which year was Apache Pig released?
(A) 2005
(B) 2006
(C) 2007
(D) 2008
Ans : b
54. Pig mainly operates in how many modes?
(A) 2
(B) 3
(C) 4
(D) 5
Ans : a
55. Which of the following companies developed Pig?
(A) Google
(B) Yahoo
(C) Microsoft
(D) Apple
Ans : b
56. Which of the following functions is used to read data in Pig?
(A) Write
(B) Read
(C) Perform
(D) Load
Ans : d
57. ____ is a framework for collecting and storing script-level statistics for Pig Latin.
(A) Pig Stats
(B) PStatistics
(C) Pig Statistics
(D) All of the above
Ans : c
58. Which of the following is a true statement?
(A) Pig is a high level language.
(B) Performing a Join operation in Apache Pig is pretty simple.
(C) Apache Pig is a data flow language.
(D) All of the above
Ans : d
59. Which of the following will compile PigUnit?
(A) $pig_trunk ant pigunit-jar
(B) $pig_tr ant pigunit-jar
(C) $pig_ ant pigunit-jar
(D) $pigtr_ ant pigunit-jar
Ans : a
60. Point out the wrong statement.
(A) Pig can invoke code in languages like Java only
(B) Pig enables data workers to write complex data transformations without knowing Java
(C) Pig’s simple SQL-like scripting language is called Pig Latin, and appeals to developers already familiar with scripting languages and SQL
(D) Pig is complete, so you can do all required data manipulations in Apache Hadoop with Pig
Ans : a
61. You can run Pig in interactive mode using the ____ shell.
(A) Grunt
(B) FS
(C) HDFS
(D) None of the mentioned
Ans : a
62. Which of the following is the default mode?
(A) Mapreduce
(B) Tez
(C) Local
(D) All of the mentioned
Ans : a
63. Use the ____ command to run a Pig script that can interact with the Grunt shell (interactive mode)
(A) fetch
(B) declare
(C) run
(D) all of the mentioned
Ans : c
64. What are the different complex data types in Pig?
(A) Maps
(B) Tuples
(C) Bags
(D) All of these
Answer: d
65. What are the various diagnostic operators available in Apache Pig?
(A) Dump Operator
(B) Describe Operator
(C) Explain Operator
(D) All of these
Answer: d
66. If the data has fewer elements than the specified schema in Pig, then:
(A) Pig will not do anything
(B) It will pad the end of the record columns with nulls
(C) Pig will throw an error
(D) Pig will warn you before it throws an error
Answer: b
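Note on question 66: the padding behaviour can be pictured with a short sketch. This is plain Python, not Pig; the function name `pad_to_schema` and the sample field names are hypothetical, and `None` stands in for a Pig null. Records that are longer than the schema are outside the scope of this sketch and are returned unchanged.

```python
def pad_to_schema(record, schema):
    """Pad a record that is shorter than the schema with None at the end."""
    missing = len(schema) - len(record)
    return tuple(record) + (None,) * max(missing, 0)

schema = ("name", "age", "city")
print(pad_to_schema(("alice", 30), schema))           # ('alice', 30, None)
print(pad_to_schema(("bob", 41, "Pune"), schema))     # ('bob', 41, 'Pune')
```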
67. Which of the following commands sets the value of a particular configuration variable (key)?
(A) set -v
(B) set <key>=<value>
(C) set
(D) reset
Answer: b
68. Point out the correct statement.
(A) Hive commands are non-SQL statements such as setting a property or adding a resource
(B) Set -v prints a list of configuration variables that are overridden by the user or Hive
(C) Set sets a list of variables that are overridden by the user or Hive
(D) None of the mentioned
Answer: a
69. Which of the following will remove the resource(s) from the distributed cache?
(A) delete FILE[S] <filepath>*
(B) delete JAR[S] <filepath>*
(C) delete ARCHIVE[S] <filepath>*
(D) all of the mentioned
Answer: d
70. ___ is a shell utility which can be used to run Hive queries in either interactive or batch mode.
(A) $HIVE/bin/hive
(B) $HIVE_HOME/hive
(C) $HIVE_HOME/bin/hive
(D) All of the mentioned
Answer: c
71. HiveServer2 introduced in Hive 0.11 has a new CLI called ____
(A) BeeLine
(B) SqlLine
(C) HiveLine
(D) CLilLine
Answer: a
72. Variable Substitution is disabled by using _____
(A) set hive.variable.substitute=false;
(B) set hive.variable.substitutevalues=false;
(C) set hive.variable.substitute=true;
(D) all of the mentioned
Answer: a
73. ___ supports a new command shell Beeline that works with HiveServer2.
(A) HiveServer2
(B) HiveServer3
(C) HiveServer4
(D) None of the mentioned
Answer: a
74. In __ mode HiveServer2 only accepts valid Thrift calls.
(A) Remote
(B) HTTP
(C) Embedded
(D) Interactive
Answer: a
75. HBase tables are
(A) Made read only by setting the read-only option
(B) Always writeable
(C) Always read-only
(D) Are made read only using the query to the
Answer: a
76. Every row in an HBase table has
(A) Same number of columns
(B) Same number of column families
(C) Different number of columns
(D) Different number of column families
Answer: b
77. HBase creates a new version of a record during
(A) Creation of a record
(B) Modification of a record
(C) Deletion of a record
(D) All the above
Answer: d
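Note on question 77: the idea that creation, modification and deletion each add a new version can be sketched with a toy cell. This is not the HBase client API; the class `ToyCell`, its methods and the `TOMBSTONE` marker are hypothetical, and real HBase delete markers and version retention rules are more involved than this model.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a delete marker

class ToyCell:
    """Keeps every write to a cell as a (timestamp, value) version."""

    def __init__(self):
        self.versions = []

    def put(self, timestamp, value):
        # Creation and modification both append a new version.
        self.versions.append((timestamp, value))

    def delete(self, timestamp):
        # Deletion is also recorded as a new version: a marker, not an erase.
        self.versions.append((timestamp, TOMBSTONE))

    def get(self):
        """Return the newest value, or None if the newest entry is a delete."""
        if not self.versions:
            return None
        _, value = max(self.versions, key=lambda version: version[0])
        return None if value is TOMBSTONE else value

cell = ToyCell()
cell.put(1, b"created")
cell.put(2, b"modified")
latest_before_delete = cell.get()
cell.delete(3)
print(latest_before_delete, cell.get(), len(cell.versions))
```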
78. HBaseAdmin and ____ are the two important classes in this package that provide DDL functionalities.
(A) HTableDescriptor
(B) HDescriptor
(C) HTable
(D) HTabDescriptor
Answer: a
79. Which of the following are operational commands in HBase?
(A) Get
(B) Put
(C) Delete
(D) All of the mentioned
Answer: d
80. The ___ Server assigns regions to the region servers and takes the help of Apache ZooKeeper for this task.
(A) Region
(B) Master
(C) Zookeeper
(D) All of the mentioned
Answer: b