Question 1
Which syntax splits the data into 60% training data and 40% testing data? (1 point)
- testing_data, training_data = data.randomSplit([40, 60])
- training_data, testing_data = data.randomSplit([0.6, 0.4])
- training_data, testing_data = data.randomSplit([0.4, 0.6])
- testing_data, training_data = data.randomSplit([0.6, 0.4])

Question 2
What does a VectorAssembler do? (1 point)
- It combines the individual data elements into a column.
- It combines a bunch of columns as a single vector column.
- It combines two DataFrames into one.
- It combines individual data elements into a row.

Question 3
What is the primary purpose of Spark's in-memory processing capability? (1 point)
- To enable real-time data stream processing
- To improve data ingestion performance
- To reduce disk-based I/O costs
- To support complex data transformation tasks

Question 4
What is the role of data engineers in Spark cluster monitoring? (1 point)
- To ensure the efficient running and health of the Spark cluster
- To troubleshoot issues related to data ingestion pipelines
- To optimize code and data structures for better performance
- To analyze and visualize data processed by Spark

Question 5
Your goal is to predict the height of a child, given the age and the weight. Which of the following algorithms will help you achieve that? (1 point)
- Linear regression
- K-means
- Logistic regression
- RandomSplit

Question 6
Which is the correct statement for a linear regression problem? (1 point)
- There will be 1 label column, which is non-numeric, and multiple numeric feature columns.
- There will be 1 label column, which is non-numeric, and multiple non-numeric feature columns.
- There will be 1 label column, which is text, and multiple numeric feature columns.
- There will be 1 label column, which is numeric, and multiple numeric feature columns.

Question 7
Which is the correct syntax to create a Spark session with application name "Test App"? (1 point)
- spark = SparkSession.builder.appname("Test App").createSession()
- spark = Sparksession.builder.appName("Test App").getOrCreateSession()
- spark = SparkSession.builder.appname("Test App").getOrCreate
- spark = SparkSession.builder.appName("Test App").getOrCreate()

Question 8
Which statement best defines Clustering using Spark ML? (1 point)
- It is a supervised learning technique.
- It relies on predefined labels or target variables.
- It discovers patterns and structures based on their randomness.
- It is the process of grouping similar data points together into clusters.

Question 9
Which is the correct syntax to display the columns "height" and "weight" from the dataframe named "health"? (1 point)
- health.select(["height","weight"]).show()
- health.selectcolumns("height","weight").show()
- health.show(["height","weight"])
- health.show("height","weight")

Question 10
Which statement best defines GraphFrames? (1 point)
- GraphFrames is an integral part of the Spark installation and need not be downloaded as a separate package.
- GraphFrames enables Spark to perform graph processing, run computations, and analyze standard graphs.
- GraphFrames does not contain any built-in algorithms; you can download them as a separate package as per your requirements.
- GraphFrames does not require setting a directory for checkpoints.
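
As a companion to Question 1, the idea behind a weighted random split can be sketched in plain Python. This is only an illustration and not Spark's implementation: the helper `random_split`, its `seed` parameter, and the sample `data` list are hypothetical names introduced here. The sketch is meant to show two things: the returned pieces come back in the same order as the weights, and the resulting sizes are approximate rather than exact.

```python
import random
from itertools import accumulate


def random_split(rows, weights, seed=None):
    """Illustrative weighted random split (hypothetical helper, not Spark code)."""
    total = sum(weights)
    # Normalize the weights so they sum to 1.
    fractions = [w / total for w in weights]
    # Cumulative upper bounds, one per split (the last one is 1.0).
    bounds = list(accumulate(fractions))
    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for row in rows:
        draw = rng.random()  # uniform draw in [0, 1)
        for i, upper in enumerate(bounds):
            # Put the row in the first split whose bound exceeds the draw;
            # the last split also catches any floating-point rounding gap.
            if draw < upper or i == len(bounds) - 1:
                splits[i].append(row)
                break
    return splits


# Assumed sample data: 10,000 integer "rows".
data = list(range(10_000))

# The first weight belongs to the first returned piece, the second to the second.
training_data, testing_data = random_split(data, [0.6, 0.4], seed=42)

print(len(training_data), len(testing_data))
```

Because each row is assigned by an independent random draw, the two sizes land near 6,000 and 4,000 without matching them exactly. Fixing the seed makes the split reproducible across runs.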
