    Course outline: Big Data Business Intelligence training for telecom service providers
Breakdown of topics by day (each session is 2 hours):
Day-1: Session -1: Business Overview of Why Big Data Business Intelligence in Telco.
        Case Studies from T-Mobile, Verizon etc.
        Big Data adoption rate in North American telcos, and how they are aligning their future business models and operations around Big Data BI
        Broad Scale Application Area
        Network and Service management
        Customer Churn Management
        Data Integration & Dashboard visualization
        Fraud management
        Business Rule generation
        Customer profiling
        Localized Ad pushing
        Day-1: Session-2 : Introduction of Big Data-1
        Main characteristics of Big Data: volume, variety, velocity and veracity. MPP architecture for volume.
        Data Warehouses – static schema, slowly evolving dataset
        MPP Databases like Greenplum, Exadata, Teradata, Netezza, Vertica etc.
        Hadoop Based Solutions – no conditions on structure of dataset.
        Typical pattern : HDFS, MapReduce (crunch), retrieve from HDFS
        Batch- suited for analytical/non-interactive
        Velocity: CEP streaming data
        Typical choices – CEP products (e.g. InfoSphere Streams, Apama, MarkLogic etc.)
        Less production ready – Storm/S4
        NoSQL Databases – (columnar and key-value): Best suited as analytical adjunct to data warehouse/database
        Day-1 : Session -3 : Introduction to Big Data-2
        NoSQL solutions
KV Store - Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)
        KV Store - Dynamo, Voldemort, Dynomite, SubRecord, Mo8onDb, DovetailDB
        KV Store (Hierarchical) - GT.m, Cache
        KV Store (Ordered) - TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord
        KV Cache - Memcached, Repcached, Coherence, Infinispan, EXtremeScale, JBossCache, Velocity, Terracoqua
        Tuple Store - Gigaspaces, Coord, Apache River
        Object Database - ZopeDB, db4o, Shoal
        Document Store - CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Persevere, Riak-Basho, Scalaris
        Wide Columnar Store - BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
        Varieties of Data: Introduction to Data Cleaning issue in Big Data
        RDBMS – static structure/schema; does not promote an agile, exploratory environment.
        NoSQL – semi-structured; enough structure to store data without defining an exact schema beforehand
        Data cleaning issues
        Day-1 : Session-4 : Big Data Introduction-3 : Hadoop
        When to select Hadoop?
        STRUCTURED - Enterprise data warehouses/databases can store massive data (at a cost) but impose structure (not good for active exploration)
        SEMI STRUCTURED data – tough to do with traditional solutions (DW/DB)
        Warehousing data = HUGE effort and static even after implementation
        For variety & volume of data, crunched on commodity hardware – HADOOP
        Commodity H/W needed to create a Hadoop Cluster
        Introduction to Map Reduce /HDFS
        MapReduce – distribute computing over multiple servers
        HDFS – make data available locally for the computing process (with redundancy)
        Data – can be unstructured/schema-less (unlike RDBMS)
        Developer responsibility to make sense of data
        Programming MapReduce = working with Java (pros/cons), manually loading data into HDFS
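
The MapReduce pattern introduced in this session (map, group by key, reduce) can be sketched in a few lines of plain Python. This is only an illustration of the programming model, not Hadoop code: it runs in a single process, and the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical. In a real Hadoop job the framework performs the shuffle and distributes the work across nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run map -> shuffle -> reduce in-process and return the word counts."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical log lines, used only to exercise the sketch.
    print(run_job(["dropped call dropped", "call setup failed"]))
```

Keeping the mapper and reducer as pure functions is what lets a framework run them on many servers in parallel.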
        Day-2: Session-1.1: Spark: In-memory distributed processing
        What is “In memory” processing?
        Spark SQL
        Spark SDK
        Spark API
        RDD
        Spark Lib
        HANA
        How to migrate an existing Hadoop system to Spark
        Day-2 Session -1.2: Storm -Real time processing in Big Data
        Streams
        Spouts
        Bolts
        Topologies
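
The relationship between streams, spouts, bolts and topologies can be illustrated with Python generators. This is a conceptual sketch only and does not use the Apache Storm API (Storm is a JVM-based framework that runs components in parallel across a cluster). The tuple layout `(cell_id, call_status)` and all function names are assumptions made for the example.

```python
from collections import Counter


def cdr_spout(events):
    """Spout: the source of the stream; emits one tuple at a time."""
    for event in events:
        yield event


def filter_bolt(stream, status="DROPPED"):
    """Bolt: pass on only the cell IDs of tuples with the given status."""
    for cell_id, call_status in stream:
        if call_status == status:
            yield cell_id


def count_bolt(stream):
    """Bolt: keep a running count per cell and emit each updated count."""
    counts = Counter()
    for cell_id in stream:
        counts[cell_id] += 1
        yield cell_id, counts[cell_id]


def run_topology(events):
    """Topology: wire spout -> filter bolt -> count bolt; return the final counts."""
    final = {}
    for cell_id, count in count_bolt(filter_bolt(cdr_spout(events))):
        final[cell_id] = count
    return final


if __name__ == "__main__":
    # Hypothetical call events, used only to exercise the sketch.
    sample = [("A", "OK"), ("A", "DROPPED"), ("B", "DROPPED"), ("A", "DROPPED")]
    print(run_topology(sample))
```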
        Day-2: Session-2: Big Data Management System
        Moving parts, compute nodes start/fail :ZooKeeper - For configuration/coordination/naming services
        Complex pipeline/workflow: Oozie – manage workflow, dependencies, daisy chain
        Deploy, configure, cluster management, upgrade etc (sys admin) :Ambari
        In Cloud : Whirr
        Evolving Big Data platform tools for tracking
        ETL layer application issues
        Day-2: Session-3: Predictive analytics in Business Intelligence -1: Fundamental Techniques & Machine learning based BI :
        Introduction to Machine learning
        Learning classification techniques
        Bayesian Prediction-preparing training file
        Markov random field
        Supervised and unsupervised learning
        Feature extraction
        Support Vector Machine
        Neural Network
        Reinforcement learning
        Big Data large variable problem -Random forest (RF)
        Representation learning
        Deep learning
        Big Data Automation problem – Multi-model ensemble RF
        Automation through Soft10-M
        LDA and topic modeling
        Agile learning
        Agent based learning- Example from Telco operation
        Distributed learning –Example from Telco operation
        Introduction to open source tools for predictive analytics: R, RapidMiner, Mahout
        More scalable analytics: Apache Hama, Spark and CMU GraphLab
        Day-2: Session-4: Predictive analytics ecosystem-2: Common predictive analytics problems in Telecom
        Insight analytics
        Visualization analytics
        Structured predictive analytics
        Unstructured predictive analytics
        Customer profiling
        Recommendation engine
        Pattern detection
        Rule/scenario discovery – failure, fraud, optimization
        Root cause discovery
        Sentiment analysis
        CRM analytics
        Network analytics
        Text analytics
        Technology-assisted review
        Fraud analytics
        Real-time analytics
        Day-3: Session-1: Network operation analytics: root cause analysis of network failures and service interruptions from metadata, IPDR and CRM:
        CPU Usage
        Memory Usage
        QoS Queue Usage
        Device Temperature
        Interface Error
        IOS versions
        Routing Events
        Latency variations
        Syslog analytics
        Packet Loss
        Load simulation
        Topology inference
        Performance Threshold
        Device Traps
        IPDR (IP Detail Record) collection and processing
        Use of IPDR data for subscriber bandwidth consumption, network interface utilization, modem status and diagnostics
        HFC information
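
For the performance-threshold and latency-variation topics in this session, one simple approach is to flag samples that deviate from the series mean by more than a chosen number of standard deviations. The sketch below is a minimal illustration under that assumption. The function name `flag_outliers`, the default `k=2.0` and the sample values are hypothetical, and production monitoring tools typically use more robust baselines.

```python
from statistics import mean, pstdev


def flag_outliers(samples, k=2.0):
    """Return the indexes of samples further than k standard deviations from the mean."""
    if not samples:
        return []
    mu = mean(samples)
    sigma = pstdev(samples)  # population standard deviation of the series
    if sigma == 0:
        return []  # a constant series has no outliers
    return [i for i, x in enumerate(samples) if abs(x - mu) > k * sigma]


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds.
    latency_ms = [20, 21, 19, 20, 22, 20, 95, 21]
    print(flag_outliers(latency_ms))
```

The same function could be applied per device to CPU usage, memory usage or packet-loss series, with `k` tuned per metric.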
        Day-3: Session-2: Tools for Network service failure analysis:
        Network Summary Dashboard: monitor overall network deployments and track your organization's key performance indicators
        Peak Period Analysis Dashboard: understand the application and subscriber trends driving peak utilization, with location-specific granularity
        Routing Efficiency Dashboard: control network costs and build business cases for capital projects with a complete understanding of interconnect and transit relationships
        Real-Time Entertainment Dashboard: access metrics that matter, including video views, duration, and video quality of experience (QoE)
        IPv6 Transition Dashboard: investigate the ongoing adoption of IPv6 on your network and gain insight into the applications and devices driving trends
        Case-Study-1: The Alcatel-Lucent Big Network Analytics (BNA) Data Miner
        Multi-dimensional mobile intelligence (m.IQ6)
        Day-3: Session-3: Big Data BI for Marketing/Sales – understanding sales and marketing from sales data (all topics will be shown with a live predictive analytics demo):
        To identify highest velocity clients
        To identify clients for a given product
        To identify the right set of products for a client (Recommendation Engine)
        Market segmentation technique
        Cross-sell and upsell techniques
        Client segmentation technique
        Sales revenue forecasting technique
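
The recommendation-engine topic in this session (finding the right set of products for a client) can be illustrated with a very small item co-occurrence recommender. This is a sketch under simplifying assumptions, not the method used in the course demo: products bought together are counted, and a client is offered the products that most often co-occur with what they already own. The product names and the function names `build_cooccurrence` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of products appears in the same basket."""
    co = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co


def recommend(owned, co, top_n=3):
    """Rank products the client does not own by co-occurrence with owned products."""
    owned = set(owned)
    scores = Counter()
    for (a, b), n in co.items():
        if a in owned and b not in owned:
            scores[b] += n
    # Sort by score (descending), then by name so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical purchase histories, one list per client.
    baskets = [
        ["voice", "data"],
        ["voice", "data", "iptv"],
        ["data", "iptv"],
        ["voice", "roaming"],
    ]
    co = build_cooccurrence(baskets)
    print(recommend(["voice"], co))
```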
        Day-3: Session 4: BI needed for Telco CFO office:
        Overview of the business analytics work needed in a CFO office
        Risk analysis on new investment
        Revenue, profit forecasting
        New client acquisition forecasting
        Loss forecasting
        Fraud analytics on finances (details in the next session)
        Day-4: Session-1: Fraud prevention BI from Big Data in Telco (fraud analytics):
        Bandwidth leakage / Bandwidth fraud
        Vendor fraud/over charging for projects
        Customer refund/claims frauds
        Travel reimbursement frauds
        Day-4 : Session-2: From Churning Prediction to Churn Prevention:
        3 types of churn: Active/Deliberate, Rotational/Incidental, Passive/Involuntary
        3 classifications of churned customers: Total, Hidden, Partial
        Understanding CRM variables for churn
        Customer behavior data collection
        Customer perception data collection
        Customer demographics data collection
        Cleaning CRM Data
        Unstructured CRM data (customer calls, tickets, emails) and their conversion to structured data for churn analysis
        Social media CRM: a new way to extract a customer satisfaction index
        Case Study-1 : T-Mobile USA: Churn Reduction by 50%
        Day-4: Session-3: How to use predictive analysis for root cause analysis of customer dissatisfaction:
        Case Study -1 : Linking dissatisfaction to issues – Accounting, Engineering failures like service interruption, poor bandwidth service
        Case Study-2: Big Data QA dashboard to track customer satisfaction index from various parameters such as call escalations, criticality of issues, pending service interruption events etc.
        Day-4: Session-4: Big Data Dashboard for quick accessibility of diverse data and display :
        Integration of existing application platform with Big Data Dashboard
        Big Data management
        Case Study of Big Data Dashboard: Tableau and Pentaho
        Use Big Data app to push location based Advertisement
        Tracking system and management
        Day-5 : Session-1: How to justify Big Data BI implementation within an organization:
        Defining ROI for Big Data implementation
        Case studies of analyst time saved on data collection and preparation, and the resulting productivity gain
        Case studies of revenue gain from reducing customer churn
        Revenue gain from location based and other targeted Ad
        An integrated spreadsheet approach to calculate approx. expense vs. Revenue gain/savings from Big Data implementation.
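
The expense-versus-gain comparison described for this session reduces to a simple calculation: net benefit is total gain minus total cost, and ROI is net benefit divided by total cost. The sketch below shows that arithmetic. The category names and all figures are made-up placeholders, not data from the course, and `roi_summary` is a hypothetical helper.

```python
def roi_summary(costs, gains):
    """Summarize total cost, total gain, net benefit and ROI for one period."""
    total_cost = sum(costs.values())
    total_gain = sum(gains.values())
    net_benefit = total_gain - total_cost
    # ROI is undefined when nothing was spent; report None in that case.
    roi = net_benefit / total_cost if total_cost else None
    return {
        "total_cost": total_cost,
        "total_gain": total_gain,
        "net_benefit": net_benefit,
        "roi": roi,
    }


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    costs = {"hardware": 400_000, "licenses_and_staff": 100_000}
    gains = {"churn_reduction": 450_000, "targeted_ads": 300_000}
    print(roi_summary(costs, gains))
```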
        Day-5: Session-2: Step-by-step procedure to replace a legacy data system with a Big Data system:
        Understanding practical Big Data Migration Roadmap
        What important information is needed before architecting a Big Data implementation
        Different ways of calculating the volume, velocity, variety and veracity of data
        How to estimate data growth
        Case studies from 2 telcos
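
For the data-growth estimation topic in this session, a common first approximation is compound growth: volume after n months is the current volume multiplied by (1 + monthly rate) to the power n. The sketch below assumes a constant monthly growth rate, which real traffic rarely follows exactly. The function names and sample figures are hypothetical.

```python
def project_volume(current_tb, monthly_growth_rate, months):
    """Projected data volume in TB after the given number of months."""
    return current_tb * (1 + monthly_growth_rate) ** months


def months_until(current_tb, monthly_growth_rate, capacity_tb):
    """Number of whole months until the projected volume reaches capacity_tb."""
    if current_tb >= capacity_tb:
        return 0
    if monthly_growth_rate <= 0 or current_tb <= 0:
        raise ValueError("capacity is never reached without positive growth")
    months = 0
    volume = current_tb
    while volume < capacity_tb:
        volume *= 1 + monthly_growth_rate
        months += 1
    return months


if __name__ == "__main__":
    # Placeholder figures: 100 TB today, 10% monthly growth.
    print(project_volume(100, 0.10, 12))
    print(months_until(100, 0.10, 500))
```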
        Day-5: Session 3 & 4: Review of Big Data Vendors and review of their products. Q/A session:
        Accenture
        Alcatel-Lucent
        Amazon –A9
        APTEAN (Formerly CDC Software)
        Cisco Systems
        Cloudera
        Dell
        EMC
        GoodData Corporation
        Guavus
        Hitachi Data Systems
        Hortonworks
        Huawei
        HP
        IBM
        Informatica
        Intel
        Jaspersoft
        Microsoft
        MongoDB (Formerly 10Gen)
        MU Sigma
        Netapp
        Opera Solutions
        Oracle
        Pentaho
        Platfora
        Qliktech
        Quantum
        Rackspace
        Revolution Analytics
        Salesforce
        SAP
        SAS Institute
        Sisense
        Software AG/Terracotta
        Soft10 Automation
        Splunk
        Sqrrl
        Supermicro
        Tableau Software
        Teradata
        Think Big Analytics
        Tidemark Systems
        VMware (Part of EMC)
 
     
     
         
                