Big Data, State of the Art Training
Seminar
Best
Duration: 2 days
Ref: BGA
Price 2021: 2030 € excl. taxes
Free breaks and lunches
Program
Learning objectives
At the end of the training, the participant will be able to:
- Identify economic issues
- Assess the pros and cons of Big Data
- Understand the most important problems and potential solutions
- Identify the main methods and scopes of Big Data
PROGRAM
» Introduction
- The origins of Big Data: A world of digital data, e-Health, timeline.
- The four Vs definition: Origins of the data.
- A breakthrough: Changes in quantity, quality, and habits.
- The value of data: A change in importance.
- Data as a raw material.
- The fourth paradigm of scientific discovery.
» Big Data: Processing, from acquisition to result
- The sequence of operations. Acquisition.
- Data collection: crawling, scraping.
- Managing event flows (Complex Event Processing, CEP).
- Indexing incoming flows.
- Integration with old data.
- Data quality: A fifth V?
- Different types of processing: Searching, learning (machine learning, transactional learning, data mining).
- Other sequencing models: Amazon, e-Health.
- One or more data repositories? From Hadoop to in-memory processing.
- From sentiment analysis to knowledge discovery.
» Relationships between the Cloud and Big Data
- The architecture model of public and private Clouds.
- XaaS services.
- The goals and benefits of Cloud architectures.
- Infrastructure.
- Similarities and differences between the Cloud and Big Data.
- Storage clouds.
- Classification, security, and privacy of data.
- Structure as a classification criterion: Unstructured, structured, semi-structured.
- Classification by life cycle: Temporary or permanent data, active archives.
- Security difficulties: Increased volumes, distribution.
- Potential solutions.
» Introduction to Open Data
- Philosophy of open data and goals.
- Releasing public data.
- Implementation difficulties.
- Essential features of open data.
- Areas involved. Expected benefits.
» Hardware for storage architectures
- Servers, disks, network, and use of SSD drives, importance of network infrastructure.
- Cloud architectures and more traditional architectures.
- Benefits and difficulties.
- The TCO. Power consumption: Servers (IPNM), drives (MAID).
- Object storage: principle and benefits.
- Object storage compared to traditional NAS and SAN storage.
- Software architecture.
- Levels at which storage management is implemented.
- Software Defined Storage.
- Centralized architecture (Hadoop File System).
- Peer-to-peer and hybrid architectures.
- Interfaces and connectors: S3, CDMI, FUSE, etc.
- Future of other storage types (NAS, SAN) relative to object storage.
» Data protection
- Preservation over time in the face of increased volumes.
- Online or local backups?
- Traditional archiving and active archiving.
- Links with storage hierarchy management: Future of magnetic tape.
- Multisite replication.
- Damage to storage media.
» Processing methods and scope
- Classification of analysis methods based on data volume and processing power.
- Hadoop: The MapReduce processing model.
- The Hadoop ecosystem: Hive, Pig. The difficulties of Hadoop.
- OpenStack and the Ceph data manager.
- Complex Event Processing: An example, Storm.
- From BI to Big Data.
- Back to decision-support and transactional models: NoSQL databases. Types and examples.
- Data ingestion and indexing. Two examples: Splunk and Logstash.
- Open source crawlers.
- Search and analysis: Elasticsearch.
- Learning: Mahout. In-memory.
- Visualization: Real-time or not, in the Cloud (Bime), comparison of QlikView, TIBCO Spotfire, and Tableau.
- A general architecture of data mining via Big Data.
» Use cases through examples and conclusion
- Anticipation: Needs of users within companies, equipment maintenance.
- Security: People, fraud detection (mail, taxes), the network.
- Recommendation. Marketing analysis and impact analyses.
- Path analyses. Distribution of video content.
- Big Data for the automotive industry? For the oil industry?
- Should you begin a Big Data project?
- What future is there for data?
- Governance of data storage: Roles and recommendations, data scientists, skills involved in a Big Data project.
Time schedule
Generally, courses take place from 9:00 to 12:30 and from 14:00 to 17:30.
However, on the first day attendees are welcomed from 8:45, and there is a presentation of the session between 9:15 and 9:30.
The course itself begins at 9:30. For the 4- or 5-day hands-on courses, the sessions finish at 15:30 on the last day.








