Important Data Warehousing MCQs with Answers (Set 7) | Big Data Technology

This set of Multiple Choice Questions (MCQs) covers advanced enterprise concepts of Data Warehousing & Enterprise Modeling, including outriggers, mini-dimensions, MPP optimization, columnar databases, data skew, dynamic partition elimination, bridge tables, and enterprise-scale ETL architectures. Useful for GATE, UGC NET, University Semester Exams, and Big Data Technology preparation.

Topic: Enterprise Modeling & Advanced Scenarios | Set: 7

Difficulty: Hard | Total Questions: 14


Important Data Warehousing MCQs with Answers

Q1. What defines an “Outrigger Dimension” structure within a Kimball dimensional model?

A. A dimension table that has been completely deprecated and disconnected from the database core
B. A secondary dimension table that joins directly onto another dimension table instead of linking directly to the central fact table
C. A temporary relational view built to track system pipeline error thresholds
D. A specialized column attribute that stores binary encrypted user password hashes

View Answer & Explanation

Answer: B

Explanation: An outrigger dimension extends a dimension table through another linked dimension, slightly snowflaking the schema to support additional descriptive structures.


Q2. How should an enterprise data architect handle a “Rapidly Changing Large Dimension” (such as a profile table with millions of members undergoing constant modifications)?

A. Overwrite the historical rows using Type 1 updates and accept complete history loss
B. Break out the volatile, fast-changing attributes into a separate “Changer” or mini-dimension table referenced directly by the fact table
C. Convert the entire multi-dimensional schema into a single unindexed text document
D. Restrict the dimension table to contain only numeric measure values

View Answer & Explanation

Answer: B

Explanation: Mini-dimensions isolate rapidly changing attributes from the main dimension table, preventing uncontrolled growth while preserving historical tracking.


Q3. What problem does a “Late-Arriving Fact” present to an incremental ETL loading schedule?

A. It causes physical hardware database server motherboards to overheat during data writes
B. It demands that the ETL process insert the transaction into past historical partitions and potentially recalculate affected summary views
C. It forces the system schema to drop all associated conformed dimension constraints
D. It automatically converts character text descriptions into float metrics

View Answer & Explanation

Answer: B

Explanation: Late-arriving facts must be inserted into historical partitions, often requiring summary tables and aggregates to be recalculated.


Q4. What structural optimization occurs during a Massively Parallel Processing (MPP) data warehouse “Star Join”?

A. The query optimizer executes joins by completely replicating the central fact table across every node
B. The query engine uses bitmap indexes to pre-filter fact foreign keys in memory before executing data joins on dimensions
C. The system forces all database tables to compress into a single horizontal row
D. The query analyzer blocks multi-tenant users from performing group aggregations

View Answer & Explanation

Answer: B

Explanation: Star join optimization leverages bitmap filtering to reduce candidate fact rows before expensive joins execute, improving parallel query performance.


Q5. In a corporate data warehouse, what is a “Conformed Fact”?

A. A metric calculation that uses different business names depending on the department viewing it
B. An enterprise-wide business metric that maintains an identical definition, unit of measure, and calculation logic across all data marts
C. A dimension attribute that has been converted into an integer surrogate key field
D. A temporary system logging value that records database processing error text lines

View Answer & Explanation

Answer: B

Explanation: Conformed facts ensure consistent metric definitions across all business units, preventing contradictory reports between departments.


Q6. What design issue is created by a “Hub-and-Spoke” data warehouse architecture if data marts are built without conformed dimensions?

A. The core Enterprise Data Warehouse database will immediately delete its transaction records
B. Data silos emerge, causing different departmental data marts to return conflicting metrics for identical business questions
C. The query execution paths are forced to run exclusively on local customer computers
D. The storage hardware arrays fail to support standard data compression formats

View Answer & Explanation

Answer: B

Explanation: Without conformed dimensions, independent marts drift apart, creating inconsistent business definitions and isolated analytical silos.


Q7. How does a “Columnar Database Engine” physically optimize data warehouses compared to traditional row-based engines?

A. It organizes data records continuously on disk by columns rather than rows, minimizing disk I/O for queries that scan specific attributes
B. It entirely blocks the use of relational primary and foreign key constraints across the schema
C. It forces the entire database layout to be saved inside a single unindexed flat text string file
D. It demands that all dimension data fields utilize decimal numeric data types

View Answer & Explanation

Answer: A

Explanation: Columnar engines store similar column values together, allowing analytical queries to read only required attributes instead of full rows.


Q8. What design trap is encountered when an ETL architect allows “Loop Joins” across a Fact Constellation schema?

A. The database automatically locks down all tables to operate in read-only mode
B. Ambiguous query routing paths where the same question can yield different results depending on which join loop the query engine traverses
C. The physical hard drive sectors will automatically erase metadata dictionary logs
D. Reporting client dashboards will refuse to render basic graphical visual elements

View Answer & Explanation

Answer: B

Explanation: Loop joins create multiple traversal paths between facts and dimensions, leading to ambiguous aggregations and inconsistent query outcomes.


Q9. How should an engineering pipeline resolve structural schema evolutions (such as a source table adding new columns) inside a Kimball star schema?

A. By immediately erasing the entire historical data warehouse database asset
B. Through metadata-driven ETL architectures that dynamically adjust ingestion maps and append new non-breaking attributes to the dimensions
C. By forcing all historical text descriptions to be compressed into binary formats
D. By converting the star schema layout back into a centralized Bill Inmon normalized core

View Answer & Explanation

Answer: B

Explanation: Metadata-driven pipelines adapt dynamically to schema drift, allowing warehouses to evolve safely without breaking legacy reports.


Q10. What problem does “Data Skew” present during parallel query sorting execution inside an MPP data warehouse cluster?

A. It causes all database primary keys to become hidden from analytical dashboard interfaces
B. Uneven data distribution across parallel nodes leaves a single worker node with a bottleneck allocation while other nodes sit idle
C. The system automatically shifts from a relational model to an unindexed text directory structure
D. It completely restricts multiple administrative users from accessing metadata paths

View Answer & Explanation

Answer: B

Explanation: Data skew overloads specific processing nodes with disproportionate workloads, stalling cluster-wide execution efficiency.


Q11. What occurs when a database engine performs “Dynamic Partition Elimination”?

A. It deletes historical fact tables from the storage drive arrays to save space
B. It analyzes query parameters at runtime to dynamically prune unneeded data partitions, avoiding scanning irrelevant disk blocks
C. It transforms a snowflake schema design into a flat factless fact table layout
D. It forces all string text variables to match float numeric properties

View Answer & Explanation

Answer: B

Explanation: Dynamic partition elimination filters partitions during execution, skipping irrelevant storage blocks to reduce disk I/O overhead.


Q12. When calculating weight allocations inside a multi-valued dimension bridge table, a “Fact Scaling Factor” is used to:

A. Compress the physical file size of the central fact table records
B. Proportionately fractionate fact metrics across many-to-many dimension members, preventing artificial metric inflation when summing records
C. Encrypt historical transaction values from external reporting client visualization software
D. Track the total processing execution latency of an ETL data integration process

View Answer & Explanation

Answer: B

Explanation: Scaling factors distribute fact values proportionally across related members, preventing duplicate counting during aggregations.


Q13. What is the primary operational objective of implementing “Micro-Batch ETL” architectures?

A. Restricting pipeline data integrations to run only once every single calendar year
B. Reducing data latency by scheduling frequent, high-velocity mini-batch runs to feed near-real-time insights into the warehouse core
C. Forcing all data transformations to execute inside the browser framework of customer desktops
D. Removing primary surrogate key indexing constraints from all transaction fact tables

View Answer & Explanation

Answer: B

Explanation: Micro-batching bridges traditional batch processing and streaming systems, delivering near-real-time warehouse freshness.


Q14. Evaluated from a performance stance, why are wide fact tables with hundreds of columns (the “Star Schema Belly” anti-pattern) avoided?

A. They completely prevent the database engine from executing basic SQL join operations
B. They swell the width of rows on disk, decreasing cache efficiency and forcing unnecessary disk I/O loops during analytical attribute scans
C. They automatically delete metadata description catalogs from the staging repository
D. They convert relational tables into unindexed flat files on backup servers

View Answer & Explanation

Answer: B

Explanation: Extremely wide fact rows increase storage overhead and force queries to scan unnecessary data, hurting cache locality and performance.


Conclusion

These Data Warehousing MCQ questions covered advanced enterprise concepts such as outrigger dimensions, MPP optimization, columnar databases, dynamic partition elimination, bridge tables, micro-batch ETL, and enterprise-scale dimensional modeling.

Practicing these questions is useful for Big Data Technology, Enterprise Data Warehousing, GATE CS, UGC NET, and university semester examinations.

For better understanding of theory and concepts, refer Data Warehouse Design & Models for Data-Driven Enterprises.


What’s Next?

Prepare smarter with PrepIt 📚

Subscribe to receive new MCQ sets, exam practice questions, semester resources, and technical interview preparation updates directly in your inbox.

We don’t spam! Read our privacy policy for more info.

Leave a Reply

Your email address will not be published. Required fields are marked *