The concept of a modern data architecture has evolved dramatically over the past 10-plus years. Turn the clock back and recall the days of legacy data architectures, which had many constraints. Storage was expensive and had associated hardware costs. Compute often involved appliances and more hardware investments. Networks were expensive, deployments were on-premises only, and proprietary software and hardware locked enterprises in at every turn.
This was (and for many organizations still is) a world of transactional silos where the architecture only allowed for post-transactional analytics of highly structured data. The weaknesses in these legacy architectures were exposed with the advent of new data types such as mobile and sensor data, and new analytics such as machine learning and data science. Couple that with the advent of cloud computing and you have a perfect storm.
A multitude of interconnected factors disrupted that legacy data architecture era. Storage became cheaper and new software such as Apache Hadoop took center stage. Compute also went the software route and we saw the start of edge computing. Networks became ubiquitous and provided the planet with 3G/4G/LTE connectivity. Deployments started to become hybrid, and enterprises embraced open-source software. This led to a rush of innovation as customer requirements changed, influencing the direction that vendors had to take to modernize the data architecture.
The emergence of cloud created the need to evolve again to take advantage of its unique characteristics, such as decoupled storage and compute. This led to connected data architectures, with the Hadoop ecosystem evolving for IaaS and PaaS models and innovations such as Hortonworks DataPlane Service (DPS) for connecting deployments in the data center and the public cloud.
Given that data has “mass” and is responsible for the rapid rise of cloud adoption, the data architecture must evolve again to meet the needs of today’s enterprises and take advantage of the unique benefits of cloud. So much more is required in a data architecture today to achieve our dreams of digital transformation, real-time analytics and artificial intelligence – just to name a few. This paves the way for pre-transaction analysis and drives use cases such as a 360-degree view of the customer. Organizations need a unified hybrid architecture for on-premises, multi-cloud and edge environments. The time has come to once again reimagine the data architecture, with hybrid as a key requirement.
What does it take to be hybrid? We’ve been innovating to answer this question for some time. Hybrid requires:
The last point on consistent architectures is critical – not just from a technology standpoint, but more because the differences manifest themselves in a fundamental manner in the interaction model for the user vis-a-vis the technology. As an example, when it comes to the Hadoop ecosystem today, users walk up to a shared, multi-tenant cluster and just submit their SQL queries, Spark applications, etc. In the cloud, however, users have to provision their workloads such as query instances, Spark clusters, etc., before they can run analytics.
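The difference between the two interaction models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SharedCluster` and `CloudWorkload` are hypothetical names invented for this sketch, not part of any Hortonworks or Hadoop API, and the job strings are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class SharedCluster:
    """Hypothetical on-premises model: a long-running, multi-tenant cluster."""
    name: str
    submitted: list = field(default_factory=list)

    def submit(self, user: str, job: str) -> str:
        # Users walk up and submit; the cluster already exists and is shared.
        self.submitted.append((user, job))
        return f"{job} accepted on shared cluster {self.name}"


@dataclass
class CloudWorkload:
    """Hypothetical cloud model: compute is provisioned per workload."""
    kind: str
    provisioned: bool = False

    def provision(self) -> None:
        # In this model, provisioning must happen before any job can run.
        self.provisioned = True

    def submit(self, user: str, job: str) -> str:
        if not self.provisioned:
            raise RuntimeError(
                f"{self.kind} must be provisioned before running {job}"
            )
        return f"{job} accepted on dedicated {self.kind}"
```

With the shared cluster, `SharedCluster("prod").submit("ana", "sql-query")` succeeds immediately. With the cloud workload, calling `submit` before `provision()` raises an error, which is the extra step the user has to learn when moving between the two environments.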
Today, we are excited to announce the Open Hybrid Architecture initiative – the last mile of our endeavor to deliver on the promise of hybrid. This initiative is a broad effort across the open-source communities, the partner ecosystem and Hortonworks platforms to enable a consistent experience by bringing the cloud architecture on-premises for the enterprise.
Another key benefit is helping customers settle on a consistent architecture and interaction model, which allows them to seamlessly move data and workloads across on-premises and multiple clouds using platforms such as DPS.
Through the initiative, we deliver an architecture in which it will not matter where your data is. Whether it sits in any cloud, on-premises or at the edge, enterprises can leverage open-source analytics in a secure and governed manner. The benefits of a consistent interaction model cannot be overstated: it provides the key to unlocking a seamless experience.
The Open Hybrid Architecture initiative will make this possible by:
After careful consideration, we’ve determined the best path forward is a phased approach, similar to how Hortonworks delivered enterprise-grade SQL queries-on-Hadoop via the Stinger and Stinger.Next initiatives.
The Open Hybrid Architecture initiative will include the following development phases:
Just as we enabled the modern data architecture with HDP and YARN back in the day, we’re at it again – but this time we’re bringing the innovation we’ve done in the cloud down to our products in the data center.
Hortonworks has been on a multi-year journey toward cloud-first and cloud-native architectures. The Open Hybrid Architecture initiative is the final piece of the puzzle. Not only will this initiative bring cloud-native to the data center, but it will also help our customers embrace and master the unified hybrid architectural model that is required to get the full benefits of on-premises, cloud and edge computing. We, along with our partner ecosystem and the open-source community, are excited to tackle this next redesign of the modern data architecture.
Apache, Hadoop, Falcon, Atlas, Tez, Sqoop, Flume, Kafka, Pig, Hive, HBase, Accumulo, Storm, Solr, Spark, Ranger, Knox, Ambari, ZooKeeper, Oozie, Phoenix, NiFi, Nifi Registry, HAWQ, Zeppelin, Slider, Mahout, MapReduce, HDFS, YARN, Metron and the Hadoop elephant and Apache project logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States or other countries.
© 2011-2018 Hortonworks Inc. All rights reserved.
Comments
Interesting!
BlueData already delivers the consistent hybrid architecture shown in the diagram above. We can spin up multiple HDP clusters for different use cases and connect to existing HDFS data lake or S3
Hi Arun,
A very nice article indeed. With respect to Phase 3 above, would you have some sample deployment architectures where HDP has been installed on top of Kubernetes? Or any best practices you could share around this, please?
Thanks,
Debu
Late 2018 and legacy Hadoop vendors are just now thinking about first class containerization support. How are you still in business?
The merger only bought you time. Declining revenue growth… Market is clearly migrating elsewhere. What an awesome opportunity for those who actually do a little research before finding themselves stuck in the Stone Age with HDP or Cloudera