Google Professional Cloud DevOps Engineer Practice Exam Free – 50 Questions to Simulate the Real Exam
Are you getting ready for the Google Professional Cloud DevOps Engineer certification? Take your preparation to the next level with our Google Professional Cloud DevOps Engineer Practice Exam Free – a carefully designed set of 50 realistic exam-style questions to help you evaluate your knowledge and boost your confidence.
Using a Google Professional Cloud DevOps Engineer practice exam free is one of the best ways to:
- Experience the format and difficulty of the real exam
- Identify your strengths and focus on weak areas
- Improve your test-taking speed and accuracy
Below, you will find 50 realistic Google Professional Cloud DevOps Engineer practice exam free questions covering key exam topics. Each question reflects the structure and challenge of the actual exam.
You use a multiple step Cloud Build pipeline to build and deploy your application to Google Kubernetes Engine (GKE). You want to integrate with a third-party monitoring platform by performing a HTTP POST of the build information to a webhook. You want to minimize the development effort. What should you do?
A. Add logic to each Cloud Build step to HTTP POST the build information to a webhook.
B. Add a new step at the end of the pipeline in Cloud Build to HTTP POST the build information to a webhook.
C. Use Stackdriver Logging to create a logs-based metric from the Cloud Build logs. Create an Alert with a Webhook notification type.
D. Create a Cloud Pub/Sub push subscription to the Cloud Build cloud-builds PubSub topic to HTTP POST the build information to a webhook.
You are running an application in a virtual machine (VM) using a custom Debian image. The image has the Stackdriver Logging agent installed. The VM has the cloud-platform scope. The application is logging information via syslog. You want to use Stackdriver Logging in the Google Cloud Platform Console to visualize the logs. You notice that syslog is not showing up in the "All logs" dropdown list of the Logs Viewer. What is the first thing you should do?
A. Look for the agent’s test log entry in the Logs Viewer.
B. Install the most recent version of the Stackdriver agent.
C. Verify the VM service account access scope includes the monitoring.write scope.
D. SSH to the VM and execute the following commands on your VM: ps ax | grep fluentd.
Your company runs services by using Google Kubernetes Engine (GKE). The GKE dusters in the development environment run applications with verbose logging enabled. Developers view logs by using the kubectl logs command and do not use Cloud Logging. Applications do not have a uniform logging structure defined. You need to minimize the costs associated with application logging while still collecting GKE operational logs. What should you do?
A. Run the gcloud container clusters update –logging=SYSTEM command for the development cluster.
B. Run the gcloud container clusters update –logging=WORKLOAD command for the development cluster.
C. Run the gcloud logging sinks update _Default –disabled command in the project associated with the development environment.
D. Add the severity >= DEBUG resource.type = “k8s_container” exclusion filter to the _Default logging sink in the project associated with the development environment.
Your development team has created a new version of their service's API. You need to deploy the new versions of the API with the least disruption to third-party developers and end users of third-party installed applications. What should you do?
A. Introduce the new version of the API. Announce deprecation of the old version of the API. Deprecate the old version of the API. Contact remaining users of the old API. Provide best effort support to users of the old API. Turn down the old version of the API.
B. Announce deprecation of the old version of the API. Introduce the new version of the API. Contact remaining users on the old API. Deprecate the old version of the API. Turn down the old version of the API. Provide best effort support to users of the old API.
C. Announce deprecation of the old version of the API. Contact remaining users on the old API. Introduce the new version of the API. Deprecate the old version of the API. Provide best effort support to users of the old API. Turn down the old version of the API.
D. Introduce the new version of the API. Contact remaining users of the old API. Announce deprecation of the old version of the API. Deprecate the old version of the API. Turn down the old version of the API. Provide best effort support to users of the old API.
Your company runs applications in Google Kubernetes Engine (GKE). Several applications rely on ephemeral volumes. You noticed some applications were unstable due to the DiskPressure node condition on the worker nodes. You need to identify which Pods are causing the issue, but you do not have execute access to workloads and nodes. What should you do?
A. Check the node/ephemeral_storage/used_bytes metric by using Metrics Explorer.
B. Check the container/ephemeral_storage/used_bytes metric by using Metrics Explorer.
C. Locate all the Pods with emptyDir volumes. Use the df -h command to measure volume disk usage.
D. Locate all the Pods with emptyDir volumes. Use the df -sh * command to measure volume disk usage.
You use Terraform to manage an application deployed to a Google Cloud environment. The application runs on instances deployed by a managed instance group. The Terraform code is deployed by using a CI/CD pipeline. When you change the machine type on the instance template used by the managed instance group, the pipeline fails at the terraform apply stage with the following error message:You need to update the instance template and minimize disruption to the application and the number of pipeline runs. What should you do?
A. Delete the managed instance group, and recreate it after updating the instance template.
B. Add a new instance template, update the managed instance group to use the new instance template, and delete the old instance template.
C. Remove the managed instance group from the Terraform state file, update the instance template, and reimport the managed instance group.
D. Set the create_before_destroy meta-argument to true in the lifecycle block on the instance template.
You are building and running client applications in Cloud Run and Cloud Functions. Your client requires that all logs must be available for one year so that the client can import the logs into their logging service. You must minimize required code changes. What should you do?
A. Update all images in Cloud Run and all functions in Cloud Functions to send logs to both Cloud Logging and the client’s logging service. Ensure that all the ports required to send logs are open in the VPC firewall.
B. Create a Pub/Sub topic, subscription, and logging sink. Configure the logging sink to send all logs into the topic. Give your client access to the topic to retrieve the logs.
C. Create a storage bucket and appropriate VPC firewall rules. Update all images in Cloud Run and all functions in Cloud Functions to send logs to a file within the storage bucket.
D. Create a logs bucket and logging sink. Set the retention on the logs bucket to 365 days. Configure the logging sink to send logs to the bucket. Give your client access to the bucket to retrieve the logs.
You support a user-facing web application. When analyzing the application's error budget over the previous six months, you notice that the application has never consumed more than 5% of its error budget in any given time window. You hold a Service Level Objective (SLO) review with business stakeholders and confirm that the SLO is set appropriately. You want your application's SLO to more closely reflect its observed reliability. What steps can you take to further that goal while balancing velocity, reliability, and business needs? (Choose two.)
A. Add more serving capacity to all of your application’s zones.
B. Have more frequent or potentially risky application releases.
C. Tighten the SLO match the application’s observed reliability.
D. Implement and measure additional Service Level Indicators (SLIs) fro the application.
E. Announce planned downtime to consume more error budget, and ensure that users are not depending on a tighter SLO.
You support a production service that runs on a single Compute Engine instance. You regularly need to spend time on recreating the service by deleting the crashing instance and creating a new instance based on the relevant image. You want to reduce the time spent performing manual operations while following Site Reliability Engineering principles. What should you do?
A. File a bug with the development team so they can find the root cause of the crashing instance.
B. Create a Managed instance Group with a single instance and use health checks to determine the system status.
C. Add a Load Balancer in front of the Compute Engine instance and use health checks to determine the system status.
D. Create a Stackdriver Monitoring dashboard with SMS alerts to be able to start recreating the crashed instance promptly after it was crashed.
You are configuring the frontend tier of an application deployed in Google Cloud. The frontend tier is hosted in nginx and deployed using a managed instance group with an Envoy-based external HTTP(S) load balancer in front. The application is deployed entirely within the europe-west2 region, and only serves users based in the United Kingdom. You need to choose the most cost-effective network tier and load balancing configuration. What should you use?
A. Premium Tier with a global load balancer
B. Premium Tier with a regional load balancer
C. Standard Tier with a global load balancer
D. Standard Tier with a regional load balancer
A third-party application needs to have a service account key to work properly. When you try to export the key from your cloud project, you receive an error: “The organization policy constraint iam.disableServiceAccounKeyCreation is enforced.” You need to make the third-party application work while following Google-recommended security practices. What should you do?
A. Enable the default service account key, and download the key.
B. Remove the iam.disableServiceAccountKeyCreation policy at the organization level, and create a key.
C. Disable the service account key creation policy at the project’s folder, and download the default key.
D. Add a rule to set the iam.disableServiceAccountKeyCreation policy to off in your project, and create a key.
You encountered a major service outage that affected all users of the service for multiple hours. After several hours of incident management, the service returned to normal, and user access was restored. You need to provide an incident summary to relevant stakeholders following the Site Reliability Engineering recommended practices. What should you do first?
A. Call individual stakeholders to explain what happened.
B. Develop a post-mortem to be distributed to stakeholders.
C. Send the Incident State Document to all the stakeholders.
D. Require the engineer responsible to write an apology email to all stakeholders.
You have deployed a fleet of Compute Engine instances in Google Cloud. You need to ensure that monitoring metrics and logs for the instances are visible in Cloud Logging and Cloud Monitoring by your company's operations and cyber security teams. You need to grant the required roles for the Compute Engine service account by using Identity and Access Management (IAM) while following the principle of least privilege. What should you do?
A. Grant the logging.logWriter and monitoring.metricWriter roles to the Compute Engine service accounts.
B. Grant the logging.admin and monitoring.editor roles to the Compute Engine service accounts.
C. Grant the logging.editor and monitoring.metricWriter roles to the Compute Engine service accounts.
D. Grant the logging.logWriter and monitoring.editor roles to the Compute Engine service accounts.
Your product is currently deployed in three Google Cloud Platform (GCP) zones with your users divided between the zones. You can fail over from one zone to another, but it causes a 10-minute service disruption for the affected users. You typically experience a database failure once per quarter and can detect it within five minutes. You are cataloging the reliability risks of a new real-time chat feature for your product. You catalog the following information for each risk: * Mean Time to Detect (MTTD) in minutes * Mean Time to Repair (MTTR) in minutes * Mean Time Between Failure (MTBF) in days * User Impact Percentage The chat feature requires a new database system that takes twice as long to successfully fail over between zones. You want to account for the risk of the new database failing in one zone. What would be the values for the risk of database failover with the new system?
A. MTTD: 5 MTTR: 10 MTBF: 90 Impact: 33%
B. MTTD: 5 MTTR: 20 MTBF: 90 Impact: 33%
C. MTTD: 5 MTTR: 10 MTBF: 90 Impact: 50%
D. MTTD: 5 MTTR: 20 MTBF: 90 Impact: 50%
You support a high-traffic web application with a microservice architecture. The home page of the application displays multiple widgets containing content such as the current weather, stock prices, and news headlines. The main serving thread makes a call to a dedicated microservice for each widget and then lays out the homepage for the user. The microservices occasionally fail; when that happens, the serving thread serves the homepage with some missing content. Users of the application are unhappy if this degraded mode occurs too frequently, but they would rather have some content served instead of no content at all. You want to set a Service Level Objective (SLO) to ensure that the user experience does not degrade too much. What Service Level Indicator (SLI) should you use to measure this?
A. A quality SLI: the ratio of non-degraded responses to total responses.
B. An availability SLI: the ratio of healthy microservices to the total number of microservices.
C. A freshness SLI: the proportion of widgets that have been updated within the last 10 minutes.
D. A latency SLI: the ratio of microservice calls that complete in under 100 ms to the total number of microservice calls.
You are leading a DevOps project for your organization. The DevOps team is responsible for managing the service infrastructure and being on-call for incidents. The Software Development team is responsible for writing, submitting, and reviewing code. Neither team has any published SLOs. You want to design a new joint-ownership model for a service between the DevOps team and the Software Development team. Which responsibilities should be assigned to each team in the new joint-ownership model?![]()
![]()
![]()
![]()
You support an application running on App Engine. The application is used globally and accessed from various device types. You want to know the number of connections. You are using Stackdriver Monitoring for App Engine. What metric should you use?
A. flex/connections/current
B. tcp_ssl_proxy/new_connections
C. tcp_ssl_proxy/open_connections
D. flex/instance/connections/current
Your company follows Site Reliability Engineering principles. You are writing a postmortem for an incident, triggered by a software change, that severely affected users. You want to prevent severe incidents from happening in the future. What should you do?
A. Identify engineers responsible for the incident and escalate to their senior management.
B. Ensure that test cases that catch errors of this type are run successfully before new software releases.
C. Follow up with the employees who reviewed the changes and prescribe practices they should follow in the future.
D. Design a policy that will require on-call teams to immediately call engineers and management to discuss a plan of action if an incident occurs.
You are building and deploying a microservice on Cloud Run for your organization. Your service is used by many applications internally. You are deploying a new release, and you need to test the new version extensively in the staging and production environments. You must minimize user and developer impact. What should you do?
A. Deploy the new version of the service to the staging environment. Split the traffic, and allow 1% of traffic through to the latest version. Test the latest version. If the test passes, gradually roll out the latest version to the staging and production environments.
B. Deploy the new version of the service to the staging environment. Split the traffic, and allow 50% of traffic through to the latest version. Test the latest version. If the test passes, send all traffic to the latest version. Repeat for the production environment.
C. Deploy the new version of the service to the staging environment with a new-release tag without serving traffic. Test the new-release version. If the test passes, gradually roll out this tagged version. Repeat for the production environment.
D. Deploy a new environment with the green tag to use as the staging environment. Deploy the new version of the service to the green environment and test the new version. If the tests pass, send all traffic to the green environment and delete the existing staging environment. Repeat for the production environment.
Your organization uses a change advisory board (CAB) to approve all changes to an existing service. You want to revise this process to eliminate any negative impact on the software delivery performance. What should you do? (Choose two.)
A. Replace the CAB with a senior manager to ensure continuous oversight from development to deployment.
B. Let developers merge their own changes, but ensure that the team’s deployment platform can roll back changes if any issues are discovered.
C. Move to a peer-review based process for individual changes that is enforced at code check-in time and supported by automated tests.
D. Batch changes into larger but less frequent software releases.
E. Ensure that the team’s development platform enables developers to get fast feedback on the impact of their changes.
You are using Stackdriver to monitor applications hosted on Google Cloud Platform (GCP). You recently deployed a new application, but its logs are not appearing on the Stackdriver dashboard. You need to troubleshoot the issue. What should you do?
A. Confirm that the Stackdriver agent has been installed in the hosting virtual machine.
B. Confirm that your account has the proper permissions to use the Stackdriver dashboard.
C. Confirm that port 25 has been opened in the firewall to allow messages through to Stackdriver.
D. Confirm that the application is using the required client library and the service account key has proper permissions.
Your company uses Jenkins running on Google Cloud VM instances for CI/CD. You need to extend the functionality to use infrastructure as code automation by using Terraform. You must ensure that the Terraform Jenkins instance is authorized to create Google Cloud resources. You want to follow Google-recommended practices. What should you do?
A. Confirm that the Jenkins VM instance has an attached service account with the appropriate Identity and Access Management (IAM) permissions.
B. Use the Terraform module so that Secret Manager can retrieve credentials.
C. Create a dedicated service account for the Terraform instance. Download and copy the secret key value to the GOOGLE_CREDENTIALS environment variable on the Jenkins server.
D. Add the gcloud auth application-default login command as a step in Jenkins before running the Terraform commands.
You support a stateless web-based API that is deployed on a single Compute Engine instance in the europe-west2-a zone. The Service Level Indicator (SLI) for service availability is below the specified Service Level Objective (SLO). A postmortem has revealed that requests to the API regularly time out. The time outs are due to the API having a high number of requests and running out memory. You want to improve service availability. What should you do?
A. Change the specified SLO to match the measured SLI
B. Move the service to higher-specification compute instances with more memory
C. Set up additional service instances in other zones and load balance the traffic between all instances
D. Set up additional service instances in other zones and use them as a failover in case the primary instance is unavailable
You are working with a government agency that requires you to archive application logs for seven years. You need to configure Stackdriver to export and store the logs while minimizing costs of storage. What should you do?
A. Create a Cloud Storage bucket and develop your application to send logs directly to the bucket.
B. Develop an App Engine application that pulls the logs from Stackdriver and saves them in BigQuery.
C. Create an export in Stackdriver and configure Cloud Pub/Sub to store logs in permanent storage for seven years.
D. Create a sink in Stackdriver, name it, create a bucket on Cloud Storage for storing archived logs, and then select the bucket as the log export destination.
Your application runs on Google Cloud Platform (GCP). You need to implement Jenkins for deploying application releases to GCP. You want to streamline the release process, lower operational toil, and keep user data secure. What should you do?
A. Implement Jenkins on local workstations.
B. Implement Jenkins on Kubernetes on-premises.
C. Implement Jenkins on Google Cloud Functions.
D. Implement Jenkins on Compute Engine virtual machines.
You work for a global organization and run a service with an availability target of 99% with limited engineering resources. For the current calendar month, you noticed that the service has 99.5% availability. You must ensure that your service meets the defined availability goals and can react to business changes, including the upcoming launch of new features. You also need to reduce technical debt while minimizing operational costs. You want to follow Google-recommended practices. What should you do?
A. Add N+1 redundancy to your service by adding additional compute resources to the service.
B. Identify, measure, and eliminate toil by automating repetitive tasks.
C. Define an error budget for your service level availability and minimize the remaining error budget.
D. Allocate available engineers to the feature backlog while you ensure that the service remains within the availability target.
Your organization wants to implement Site Reliability Engineering (SRE) culture and principles. Recently, a service that you support had a limited outage. A manager on another team asks you to provide a formal explanation of what happened so they can action remediations. What should you do?
A. Develop a postmortem that includes the root causes, resolution, lessons learned, and a prioritized list of action items. Share it with the manager only.
B. Develop a postmortem that includes the root causes, resolution, lessons learned, and a prioritized list of action items. Share it on the engineering organization’s document portal.
C. Develop a postmortem that includes the root causes, resolution, lessons learned, the list of people responsible, and a list of action items for each person. Share it with the manager only.
D. Develop a postmortem that includes the root causes, resolution, lessons learned, the list of people responsible, and a list of action items for each person. Share it on the engineering organization’s document portal.
You are creating and assigning action items in a postmodern for an outage. The outage is over, but you need to address the root causes. You want to ensure that your team handles the action items quickly and efficiently. How should you assign owners and collaborators to action items?
A. Assign one owner for each action item and any necessary collaborators.
B. Assign multiple owners for each item to guarantee that the team addresses items quickly.
C. Assign collaborators but no individual owners to the items to keep the postmortem blameless.
D. Assign the team lead as the owner for all action items because they are in charge of the SRE team.
Your application's performance in Google Cloud has degraded since the last release. You suspect that downstream dependencies might be causing some requests to take longer to complete. You need to investigate the issue with your application to determine the cause. What should you do?
A. Configure Error Reporting in your application.
B. Configure Google Cloud Managed Service for Prometheus in your application.
C. Configure Cloud Profiler in your application.
D. Configure Cloud Trace in your application.
You are running a real-time gaming application on Compute Engine that has a production and testing environment. Each environment has their own Virtual Private Cloud (VPC) network. The application frontend and backend servers are located on different subnets in the environment's VPC. You suspect there is a malicious process communicating intermittently in your production frontend servers. You want to ensure that network traffic is captured for analysis. What should you do?
A. Enable VPC Flow Logs on the production VPC network frontend and backend subnets only with a sample volume scale of 0.5.
B. Enable VPC Flow Logs on the production VPC network frontend and backend subnets only with a sample volume scale of 1.0.
C. Enable VPC Flow Logs on the testing and production VPC network frontend and backend subnets with a volume scale of 0.5. Apply changes in testing before production.
D. Enable VPC Flow Logs on the testing and production VPC network frontend and backend subnets with a volume scale of 1.0. Apply changes in testing before production.
You have an application that runs on Cloud Run. You want to use live production traffic to test a new version of the application, while you let the quality assurance team perform manual testing. You want to limit the potential impact of any issues while testing the new version, and you must be able to roll back to a previous version of the application if needed. How should you deploy the new version? (Choose two.)
A. Deploy the application as a new Cloud Run service.
B. Deploy a new Cloud Run revision with a tag and use the –no-traffic option.
C. Deploy a new Cloud Run revision without a tag and use the –no-traffic option.
D. Deploy the new application version and use the –no-traffic option. Route production traffic to the revision’s URL.
E. Deploy the new application version, and split traffic to the new version.
You are managing an application that exposes an HTTP endpoint without using a load balancer. The latency of the HTTP responses is important for the user experience. You want to understand what HTTP latencies all of your users are experiencing. You use Stackdriver Monitoring. What should you do?
A. In your application, create a metric with a metricKind set to DELTA and a valueType set to DOUBLE. ג€¢ In Stackdriver’s Metrics Explorer, use a Stacked Bar graph to visualize the metric.
B. In your application, create a metric with a metricKind set to CUMULATIVE and a valueType set to DOUBLE. ג€¢ In Stackdriver’s Metrics Explorer, use a Line graph to visualize the metric.
C. In your application, create a metric with a metricKind set to GAUGE and a valueType set to DISTRIBUTION. ג€¢ In Stackdriver’s Metrics Explorer, use a Heatmap graph to visualize the metric.
D. In your application, create a metric with a metricKind set to METRIC_KIND_UNSPECIFIED and a valueType set to INT64. ג€¢ In Stackdriver’s Metrics Explorer, use a Stacked Area graph to visualize the metric.
You are configuring Cloud Logging for a new application that runs on a Compute Engine instance with a public IP address. A user-managed service account is attached to the instance. You confirmed that the necessary agents are running on the instance but you cannot see any log entries from the instance in Cloud Logging. You want to resolve the issue by following Google-recommended practices. What should you do?
A. Export the service account key and configure the agents to use the key.
B. Update the instance to use the default Compute Engine service account.
C. Add the Logs Writer role to the service account.
D. Enable Private Google Access on the subnet that the instance is in.
You support a service that recently had an outage. The outage was caused by a new release that exhausted the service memory resources. You rolled back the release successfully to mitigate the impact on users. You are now in charge of the post-mortem for the outage. You want to follow Site Reliability Engineering practices when developing the post-mortem. What should you do?
A. Focus on developing new features rather than avoiding the outages from recurring.
B. Focus on identifying the contributing causes of the incident rather than the individual responsible for the cause.
C. Plan individual meetings with all the engineers involved. Determine who approved and pushed the new release to production.
D. Use the Git history to find the related code commit. Prevent the engineer who made that commit from working on production services.
You are building the CI/CD pipeline for an application deployed to Google Kubernetes Engine (GKE). The application is deployed by using a Kubernetes Deployment, Service, and Ingress. The application team asked you to deploy the application by using the blue/green deployment methodology. You need to implement the rollback actions. What should you do?
A. Run the kubectl rollout undo command.
B. Delete the new container image, and delete the running Pods.
C. Update the Kubernetes Service to point to the previous Kubernetes Deployment.
D. Scale the new Kubernetes Deployment to zero.
Your application artifacts are being built and deployed via a CI/CD pipeline. You want the CI/CD pipeline to securely access application secrets. You also want to more easily rotate secrets in case of a security breach. What should you do?
A. Prompt developers for secrets at build time. Instruct developers to not store secrets at rest.
B. Store secrets in a separate configuration file on Git. Provide select developers with access to the configuration file.
C. Store secrets in Cloud Storage encrypted with a key from Cloud KMS. Provide the CI/CD pipeline with access to Cloud KMS via IAM.
D. Encrypt the secrets and store them in the source code repository. Store a decryption key in a separate repository and grant your pipeline access to it.
You support a popular mobile game application deployed on Google Kubernetes Engine (GKE) across several Google Cloud regions. Each region has multiple Kubernetes clusters. You receive a report that none of the users in a specific region can connect to the application. You want to resolve the incident while following Site Reliability Engineering practices. What should you do first?
A. Reroute the user traffic from the affected region to other regions that don’t report issues.
B. Use Stackdriver Monitoring to check for a spike in CPU or memory usage for the affected region.
C. Add an extra node pool that consists of high memory and high CPU machine type instances to the cluster.
D. Use Stackdriver Logging to filter on the clusters in the affected region, and inspect error messages in the logs.
You use Cloud Build to build and deploy your application. You want to securely incorporate database credentials and other application secrets into the build pipeline. You also want to minimize the development effort. What should you do?
A. Create a Cloud Storage bucket and use the built-in encryption at rest. Store the secrets in the bucket and grant Cloud Build access to the bucket.
B. Encrypt the secrets and store them in the application repository. Store a decryption key in a separate repository and grant Cloud Build access to the repository.
C. Use client-side encryption to encrypt the secrets and store them in a Cloud Storage bucket. Store a decryption key in the bucket and grant Cloud Build access to the bucket.
D. Use Cloud Key Management Service (Cloud KMS) to encrypt the secrets and include them in your Cloud Build deployment configuration. Grant Cloud Build access to the KeyRing.
You support a trading application written in Python and hosted on App Engine flexible environment. You want to customize the error information being sent to Stackdriver Error Reporting. What should you do?
A. Install the Stackdriver Error Reporting library for Python, and then run your code on a Compute Engine VM.
B. Install the Stackdriver Error Reporting library for Python, and then run your code on Google Kubernetes Engine.
C. Install the Stackdriver Error Reporting library for Python, and then run your code on App Engine flexible environment.
D. Use the Stackdriver Error Reporting API to write errors from your application to ReportedErrorEvent, and then generate log entries with properly formatted error messages in Stackdriver Logging.
You have an application running in Google Kubernetes Engine. The application invokes multiple services per request but responds too slowly. You need to identify which downstream service or services are causing the delay. What should you do?
A. Analyze VPC flow logs along the path of the request.
B. Investigate the Liveness and Readiness probes for each service.
C. Create a Dataflow pipeline to analyze service metrics in real time.
D. Use a distributed tracing framework such as OpenTelemetry or Stackdriver Trace.
Your organization wants to increase the availability target of an application from 99.9% to 99.99% for an investment of $2,000. The application's current revenue is $1,000,000. You need to determine whether the increase in availability is worth the investment for a single year of usage. What should you do?
A. Calculate the value of improved availability to be $900, and determine that the increase in availability is not worth the investment.
B. Calculate the value of improved availability to be $1,000, and determine that the increase in availability is not worth the investment.
C. Calculate the value of improved availability to be $1,000, and determine that the increase in availability is worth the investment.
D. Calculate the value of improved availability to be $9,000, and determine that the increase in availability is worth the investment.
Your organization has a containerized web application that runs on-premises. As part of the migration plan to Google Cloud, you need to select a deployment strategy and platform that meets the following acceptance criteria: 1. The platform must be able to direct traffic from Android devices to an Android-specific microservice. 2. The platform must allow for arbitrary percentage-based traffic splitting 3. The deployment strategy must allow for continuous testing of multiple versions of any microservice. What should you do?
A. Deploy the canary release of the application to Cloud Run. Use traffic splitting to direct 10% of user traffic to the canary release based on the revision tag.
B. Deploy the canary release of the application to App Engine. Use traffic splitting to direct a subset of user traffic to the new version based on the IP address.
C. Deploy the canary release of the application to Compute Engine. Use Anthos Service Mesh with Compute Engine to direct 10% of user traffic to the canary release by configuring the virtual service.
D. Deploy the canary release to Google Kubernetes Engine with Anthos Service Mesh. Use traffic splitting to direct 10% of user traffic to the new version based on the user-agent header configured in the virtual service.
You are analyzing Java applications in production. All applications have Cloud Profiler and Cloud Trace installed and configured by default. You want to determine which applications need performance tuning. What should you do? (Choose two.)
A. Examine the wall-clock time and the CPU time of the application. If the difference is substantial increase the CPU resource allocation.
B. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the memory resource allocation.
C. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the local disk storage allocation.
D. Examine the latency time the wall-clock time and the CPU time of the application. If the latency time is slowly burning down the error budget, and the difference between wall-clock time and CPU time is minimal mark the application for optimization.
E. Examine the heap usage of the application. If the usage is low, mark the application for optimization.
You support the backend of a mobile phone game that runs on a Google Kubernetes Engine (GKE) cluster. The application is serving HTTP requests from users. You need to implement a solution that will reduce the network cost. What should you do?
A. Configure the VPC as a Shared VPC Host project.
B. Configure your network services on the Standard Tier.
C. Configure your Kubernetes cluster as a Private Cluster.
D. Configure a Google Cloud HTTP Load Balancer as Ingress.
As a Site Reliability Engineer, you support an application written in Go that runs on Google Kubernetes Engine (GKE) in production. After releasing a new version of the application, you notice the application runs for about 15 minutes and then restarts. You decide to add Cloud Profiler to your application and now notice that the heap usage grows constantly until the application restarts. What should you do?
A. Increase the CPU limit in the application deployment.
B. Add high memory compute nodes to the cluster.
C. Increase the memory limit in the application deployment.
D. Add Cloud Trace to the application, and redeploy.
You need to introduce postmortems into your organization. You want to ensure that the postmortem process is well received. What should you do? (Choose two.)
A. Encourage new employees to conduct postmortems to team through practice.
B. Create a designated team that is responsible for conducting all postmortems.
C. Encourage your senior leadership to acknowledge and participate in postmortems.
D. Ensure that writing effective postmortems is a rewarded and celebrated practice.
E. Provide your organization with a forum to critique previous postmortems.
You are the Site Reliability Engineer responsible for managing your company's data services and products. You regularly navigate operational challenges, such as unpredictable data volume and high cost, with your company's data ingestion processes. You recently learned that a new data ingestion product will be developed in Google Cloud. You need to collaborate with the product development team to provide operational input on the new product. What should you do?
A. Deploy the prototype product in a test environment, run a load test, and share the results with the product development team.
B. When the initial product version passes the quality assurance phase and compliance assessments, deploy the product to a staging environment. Share error logs and performance metrics with the product development team.
C. When the new product is used by at least one internal customer in production, share error logs and monitoring metrics with the product development team.
D. Review the design of the product with the product development team to provide feedback early in the design phase.
You are deploying an application to Cloud Run. The application requires a password to start. Your organization requires that all passwords are rotated every 24 hours, and your application must have the latest password. You need to deploy the application with no downtime. What should you do?
A. Store the password in Secret Manager and send the secret to the application by using environment variables.
B. Store the password in Secret Manager and mount the secret as a volume within the application.
C. Use Cloud Build to add your password into the application container at build time. Ensure that Artifact Registry is secured from public access.
D. Store the password directly in the code. Use Cloud Build to rebuild and deploy the application each time the password changes.
You are reviewing your deployment pipeline in Google Cloud Deploy. You must reduce toil in the pipeline, and you want to minimize the amount of time it takes to complete an end-to-end deployment. What should you do? (Choose two.)
A. Create a trigger to notify the required team to complete the next step when manual intervention is required.
B. Divide the automation steps into smaller tasks.
C. Use a script to automate the creation of the deployment pipeline in Google Cloud Deploy.
D. Add more engineers to finish the manual steps.
E. Automate promotion approvals from the development environment to the test environment.
You are developing reusable infrastructure as code modules. Each module contains integration tests that launch the module in a test project. You are using GitHub for source control. You need to continuously test your feature branch and ensure that all code is tested before changes are accepted. You need to implement a solution to automate the integration tests. What should you do?
A. Use a Jenkins server for CI/CD pipelines. Periodically run all tests in the feature branch.
B. Ask the pull request reviewers to run the integration tests before approving the code.
C. Use Cloud Build to run the tests. Trigger all tests to run after a pull request is merged.
D. Use Cloud Build to run tests in a specific folder. Trigger Cloud Build for every GitHub pull request.
Free Access Full Google Professional Cloud DevOps Engineer Practice Exam Free
Looking for additional practice? Click here to access a full set of Google Professional Cloud DevOps Engineer practice exam free questions and continue building your skills across all exam domains.
Our question sets are updated regularly to ensure they stay aligned with the latest exam objectives—so be sure to visit often!
Good luck with your Google Professional Cloud DevOps Engineer certification journey!