ORSYS Training
CONTACT - +352 26 49 79 1204

Big Data: State of the Art Training

Seminar
Best
Duration: 2 days
Ref: BGA
Price 2021: 2030 € excl. taxes
Free breaks and lunches
Program

Learning objectives

At the end of the training, the participant will be able to:
  • Identify economic issues
  • Assess the pros and cons of Big Data
  • Understand the most important problems and potential solutions
  • Identify the main methods and scopes of Big Data
PROGRAM

» Introduction

  • The origins of Big Data: A world of digital data, e-Health, timeline.
  • The four-V's definition: Origins of the data.
  • A breakthrough: Changes in quantity, quality, and habits.
  • The value of data: A change in importance.
  • Data as a raw material.
  • The fourth paradigm of scientific discovery.

» Big Data: Processing, from acquisition to result

  • The sequence of operations. Acquisition.
  • Data collection: crawling, scraping.
  • Managing event flows (Complex Event Processing, CEP).
  • Indexing incoming flows.
  • Integration with old data.
  • Data quality: A fifth V?
  • Different types of processing: Searching, learning (machine learning, transactional learning, data mining).
  • Other sequencing models: Amazon, e-Health.
  • One or more data repositories? From Hadoop to in-memory processing.
  • From sentiment analysis to knowledge discovery.

» Relationships between the Cloud and Big Data

  • The architecture model of public and private Clouds.
  • XaaS services.
  • The goals and benefits of Cloud architectures.
  • Infrastructure.
  • Similarities and differences between the Cloud and Big Data.
  • Storage clouds.
  • Classification, security, and privacy of data.
  • Structure as a classification criterion: Unstructured, structured, semi-structured.
  • Classification by life cycle: Temporary or permanent data, active archives.
  • Security difficulties: Increased volumes, distribution.
  • Potential solutions.

» Introduction to Open Data

  • Philosophy of open data and goals.
  • Releasing public data.
  • Implementation difficulties.
  • Essential features of open data.
  • Areas involved. Expected benefits.

» Hardware for storage architectures

  • Servers, disks, network, and use of SSD drives, importance of network infrastructure.
  • Cloud architectures and more traditional architectures.
  • Benefits and difficulties.
  • The TCO. Power consumption: Servers (IPNM), drives (MAID).
  • Object storage: principle and benefits.
  • Object storage compared to traditional NAS and SAN storage.
  • Software architecture.
  • Levels at which storage management is implemented.
  • Software Defined Storage.
  • Centralized architecture (Hadoop File System).
  • Peer-to-peer and hybrid architectures.
  • Interfaces and connectors: S3, CDMI, FUSE, etc.
  • Future of other storage types (NAS, SAN) relative to object storage.

» Data protection

  • Preservation over time in the face of increased volumes.
  • Online or local backups?
  • Traditional archiving and active archiving.
  • Links with storage hierarchy management: Future of magnetic tape.
  • Multisite replication.
  • Damage to storage media.

» Processing methods and scope

  • Classification of analysis methods based on data volume and processing power.
  • Hadoop: The MapReduce processing model.
  • The Hadoop ecosystem: Hive, Pig. The difficulties of Hadoop.
  • OpenStack and the Ceph data manager.
  • Complex Event Processing, with Storm as an example.
  • From BI to Big Data.
  • A return to decision-support and transactional models: NoSQL databases. Types and examples.
  • Data ingestion and indexing. Two examples: Splunk and Logstash.
  • Open source crawlers.
  • Search and analysis: Elasticsearch.
  • Learning: Mahout. In-memory.
  • Visualization: Real-time or not, in the Cloud (Bime), comparison of QlikView, TIBCO Spotfire, and Tableau.
  • A general architecture of data mining via Big Data.

» Use cases through examples, and conclusion

  • Anticipation: Needs of users within companies, equipment maintenance.
  • Security: People, fraud detection (mail, taxes), the network.
  • Recommendation. Marketing analysis and impact analyses.
  • Path analyses. Distribution of video content.
  • Big Data for the automotive industry? For the oil industry?
  • Should you begin a Big Data project?
  • What future is there for data?
  • Governance of data storage: Roles and recommendations, data scientists, skills involved in a Big Data project.
Participants / Prerequisite

» Participants

Prerequisites

» Prerequisite

Time schedule

Generally, courses take place from 9:00 to 12:30 and from 14:00 to 17:30.
However, on the first day attendees are welcomed from 8:45, and there is a presentation of the session between 9:15 and 9:30.
The course itself begins at 9:30. For the 4- or 5-day hands-on courses, the sessions finish at 15:30 on the last day.