External reviews
External reviews are not included in the AWS star rating for the product.
Centralized Governance through Unity Catalog.
What do you like best about the product?
My comments on the Lakehouse are specific to Unity Catalog (UC):
Governance is all about being a "benevolent bad cop" to enterprise audiences! Until now (i.e., the advent of UC), that message was mostly/only deliverable via a stale PowerPoint, and usually after the governance teams had enforced compliance standards, possibly following an adverse event such as a data breach. What I have been able to show-and-tell via live Databricks UC demos to the enterprise users of the largest healthcare provider has captured the rapt attention of the folks! That is my experience. Now, coming to the features that UC offers: Okta integration to rope the identities of any IAM system over into UC, APIs to set up access grants and create schema objects, security via row-level security (RLS) and column-level masking (CLM), and above all, I feel, the cross-workspace access setup for LOBs/teams with data assets across several catalogs goes a long way toward ensuring seamless and ubiquitous data sharing.
These features allow power users who are skilled in ANSI SQL to execute their queries across the three-level namespace (catalog.schema.table) once cross-workspace access is set up (see the sketch below). Now, coming to the ML-model-building data scientists and citizen data scientists: model experiments, along with their features, can be registered centrally in Unity Catalog to ensure centralized governance of the ensuing endpoints that enable Model Serving.
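For instance, here is a minimal sketch of the kind of cross-catalog query a power user can run once cross-workspace access is in place, assuming a Databricks notebook with an active `spark` session; the catalog, schema, and table names are hypothetical.

```python
# The three-level namespace (catalog.schema.table) lets one query span catalogs
# owned by different LOBs, provided the Unity Catalog grants are in place.
df = spark.sql("""
    SELECT m.member_id, c.claim_amount
    FROM claims_catalog.curated.claims AS c
    JOIN members_catalog.curated.members AS m
      ON c.member_id = m.member_id
""")
df.show(10)
```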
The future release of ABAC (attribute-based, as opposed to role-based, access control) could deliver compute/cluster economies of scale and scope from a cost perspective while making sensitive-data masking and tagging at the DDL level seamless.
Another eagerly anticipated feature would be automated sensitive-data identification and tagging, via the Okera integration, of all Databricks-registered data assets in Databricks catalogs.
The use of service principals as identities opens up scope to intelligently manage, or work around, the limit on the number of AD groups/global groups that can be created.
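As an illustration, a minimal sketch (again assuming an active `spark` session) of granting Unity Catalog privileges directly to a service principal, identified by its application ID, rather than fanning out yet another AD group; the ID and object names are hypothetical.

```python
# Hypothetical service principal application ID and catalog/schema names.
sp = "`a1b2c3d4-5678-90ab-cdef-1234567890ab`"

# Grant catalog- and schema-level privileges to the service principal.
spark.sql(f"GRANT USE CATALOG ON CATALOG claims_catalog TO {sp}")
spark.sql(f"GRANT USE SCHEMA, SELECT ON SCHEMA claims_catalog.curated TO {sp}")
```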
These are my current observations.
What do you dislike about the product?
Not a "poke in the eye" of the hard working Solutions Enginners who face us the clients, music , but ....
1. The product engineering teams appear not to have digested the governance narratives that enterprises expect to be met out of the box, not deferred to a future product release.
2. Spark-engine-centric Databricks compute/workspaces will inevitably see heavy legacy SQL code with all its fun (hard coding, nested sub-queries, temp tables, CTAS, et al.), yet the product engineering teams appear not to have such folks involved at the product design phase. Ditto, even more so, for point #1.
3. The publicly available documentation pertaining to features appears to be stale when compared with the features being released.
4. The commitment to deliver a feature (for example, ABAC) by the set date has slipped by several quarters over close to two years! When you promise to solve world hunger and keep moving the goalposts, credibility is impaired.
What problems is the product solving and how is that benefiting you?
Hey, how come your smart alecks did not realize that we use Databricks for data governance? List that as well!
Databricks provides seamless, faster data processing for our customers.
What do you like best about the product?
Unity Catalog, Delta Live Tables, Lakehouse solutions
What do you dislike about the product?
Nothing as such that I have observed so far; all the features are awesome.
What problems is the product solving and how is that benefiting you?
Enterprise Lakehouse, Delta Live Tables
Best product for both data lake and data warehouse; it reduces cost and delivers data faster
What do you like best about the product?
Best product for both data lake and data warehouse
Cost reduction
What do you dislike about the product?
Logging is not good
Integration with visualization tools is a bit complex
What problems is the product solving and how is that benefiting you?
Data distribution for big data
It's very good for processing large amounts of data
What do you like best about the product?
Its data processing velocity, storage distribution, and compatibility with any kind of data
What do you dislike about the product?
It's good overall. Cluster costs, I think, are high
What problems is the product solving and how is that benefiting you?
It helps me build optimised and efficient ETL logic for our business cases, and also with data analysis, data validation, and data processing.
It is a very useful and user-friendly platform
What do you like best about the product?
It is a very useful and user-friendly platform.
What do you dislike about the product?
It is a very useful and user-friendly platform, and it is very easy to implement. There are lots of features.
What problems is the product solving and how is that benefiting you?
It's a very useful platform for ETL.
Unlocking the Power of Data: A Deep Dive into Databricks Lakehouse Platform
What do you like best about the product?
Unified platform for data & AI, with workspaces for data engineering, data science & SQL analytics
Auto Loader with schema evolution makes incremental, de-duplicated feeds easier (see the sketch after this list)
Delta Live Tables pipelines work for both batch and streaming, which helps build a serverless lakehouse without much capacity planning
Data quality expectations & observability are built into DLT pipelines and ready to use
Unity Catalog solves the data silo problem by providing fine-grained access control & unified governance
Good Databricks community support exists for Databricks partners & Databricks customers
Ease of building metadata-driven frameworks
Good integration with lots of tools for different segments using Partner Connect
Azure Repos integration is another cool feature
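A minimal sketch of what such an Auto Loader feed with DLT expectations could look like, assuming a Delta Live Tables pipeline notebook; the paths, table names, and columns are hypothetical.

```python
import dlt

@dlt.table(comment="Raw orders ingested incrementally with Auto Loader")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # data quality expectation
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")           # Auto Loader
        .option("cloudFiles.format", "json")
        .option("cloudFiles.inferColumnTypes", "true")  # lean on schema inference/evolution
        .load("/Volumes/main/landing/orders")           # hypothetical landing path
    )

@dlt.table(comment="De-duplicated orders")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
        .withWatermark("ingest_time", "1 hour")         # hypothetical timestamp column
        .dropDuplicates(["order_id"])                   # incremental de-duplication
    )
```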
What do you dislike about the product?
Managing cost is complicated when it comes to non-DLT pipelines, in terms of capacity planning for the clusters, but this can be addressed through DLT & SQL warehouse endpoints
There is some vendor lock-in from using the Delta format with DLT-based pipelines, but this format should soon be supported on other platforms
Databricks is not a GUI-based drag-and-drop ETL tool; a learning curve in Spark and the Scala, Python, or SQL programming languages is required
What problems is the product solving and how is that benefiting you?
Unifying batch & streaming data in a single platform
Bringing the data lake & data warehouse together with the data lakehouse platform
Data collaboration, data federation, and data mesh can be achieved through Unity Catalog's unified governance
Performing CDC used to be complicated in a data lake, but that is now solved through Auto Loader with change data feed & DLT for real-time changes (see the sketch after this list)
Multi-cloud: no cloud-provider lock-in for compute resourcing (the control plane & data plane are separated)
Numerous integrations are possible with easy connectivity options
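For example, a minimal sketch of reading a Delta table's change data feed for CDC, assuming the table has `delta.enableChangeDataFeed` set and a `spark` session is available; the table name and starting version are hypothetical.

```python
# Row-level changes (inserts/updates/deletes) since a given table version.
changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 5)      # hypothetical starting point
    .table("main.sales.orders")
)
changes.select("order_id", "_change_type", "_commit_version", "_commit_timestamp").show()
```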
Best tool available in the market for data management
What do you like best about the product?
It combines the best elements of data lakes and data warehouses. Storage formats are open standards, and it provides APIs to access the data directly. Real-time reporting is also a very good feature that eliminates the need for separate systems to serve real-time data applications.
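For instance, because the tables are stored in an open format, they can be read directly with standard APIs, as in this minimal sketch; the storage path is hypothetical and a `spark` session is assumed.

```python
# Read a Delta table straight from object storage, without any proprietary access layer.
df = spark.read.format("delta").load("s3://company-lakehouse/sales/orders")
df.printSchema()
```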
What do you dislike about the product?
It can be difficult for new users to get hands-on. The UI could be improved to help new users navigate more easily.
What problems is the product solving and how is that benefiting you?
Combines the best of both the data lake and the data warehouse in the Databricks lakehouse.
APIs to work with third party tools.
Data sharing with external users/customers.
Lakehouse platform review
What do you like best about the product?
Its ease of use and the optimisation Databricks offers for big data test cases
What do you dislike about the product?
Sometimes it's slow, and initial learning should be encouraged
What problems is the product solving and how is that benefiting you?
Processing customer feedback captured through audio and video, then feeding the data into the Lakehouse platform to generate insights for a multinational company
Unified analytics platform
What do you like best about the product?
ACID transaction support in Delta Lake (see the sketch after this list).
It is a platform for both data engineering and data science.
Flexibility with different types of data.
Scalability and performance.
Integration with cloud services
Collaboration features
Warehousing for real-time or near-real-time data
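As an illustration, a minimal sketch of an ACID upsert (MERGE) into a Delta table, assuming a `spark` session; the table and view names and the `updates_df` DataFrame are hypothetical.

```python
# Stage incoming changes as a temporary view (updates_df is a hypothetical DataFrame).
updates_df.createOrReplaceTempView("updates")

# Atomic upsert: matched rows are updated, new rows are inserted, all in one transaction.
spark.sql("""
    MERGE INTO main.sales.customers AS t
    USING updates AS s
      ON t.customer_id = s.customer_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```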
What do you dislike about the product?
Cost will be the primary concern; it can push several firms to go for other options.
Maintenance for complex deployments
What problems is the product solving and how is that benefiting you?
Improved performance and scalability in processing data.
Easier adoption of advanced analytics tools.
Helps with real-time data processing.
Adopting ACID properties in the lakehouse helps with data quality and reliability.
A Robust Solution for Big Data Management and Analytics
What do you like best about the product?
I really appreciate how Databricks Lakehouse Platform merges the strengths of data lakes and data warehouses, creating a unified space for handling data tasks. It’s like having the best of both worlds which simplifies data management and analytics significantly. The collaborative notebooks are a cherry on top, fostering teamwork and making iterative development smooth among our data teams. Plus, the high-speed analytics capabilities ensure that data queries and sharing are a breeze, which is crucial for our day-to-day decision-making processes.
What do you dislike about the product?
The initial hill to climb in terms of learning can be a bit steep, especially if you're new to big data platforms. It felt like a slow start, but once past that hurdle, things started to click. The cost factor can be a bit of a pinch, especially for smaller setups or projects on a tight budget. While there's a decent amount of guidance out there, the documentation can sometimes leave you wanting more, especially when you hit complex or unique challenges that require a deeper dive to navigate through.
What problems is the product solving and how is that benefiting you?
Databricks Lakehouse Platform is tackling the headache of juggling between data lakes and data warehouses. It's kind of bundled the two into one neat package, making data management and analytics way less complicated. This fusion is cutting down on the tech pile-up in our projects big time, making it simpler to switch between data engineering and data science chores. When it comes to number crunching, the platform's speedy analytics is making queries quick and data sharing reliable, which is big for our decision-making. The collaborative notebooks are a cool feature too, they're encouraging teamwork and making back-and-forth on development ideas smoother among our data teams. There’s a bit to chew on initially learning-wise and the pricing can sting a little, but the payback in simpler data operations and better teamwork is solid.