Innovation & Technology Senior Data Scientist

  • Competitive
  • New York, NY, United States
  • Permanent, Full-time
  • PWC - US
  • Dec. 13, 2017

PwC/LOS Overview
PwC is a network of firms committed to delivering quality in assurance, tax and advisory services.

We help resolve complex issues for our clients and identify opportunities. Learn more about us at www.pwc.com/us.

At PwC, we develop leaders at all levels. The distinctive leadership framework we call the PwC Professional (http://pwc.to/pwcpro) provides our people with a road map to grow their skills and build their careers. Our approach to ongoing development shapes employees into leaders, no matter the role or job title.

Are you ready to build a career in a rapidly changing world? Developing as a PwC Professional means that you will be ready to create and capture opportunities to advance your career and fulfill your potential. To learn more, visit us at www.pwc.com/careers.

What will you do if you work in Assurance at PwC?
You'll ask questions and test assumptions. You'll help determine if companies are reporting information that investors and others can rely on. You'll help businesses solve complex issues faced by management and boards. You'll serve the public interest and the capital markets by conducting quality audits. Visit http://pwc.to/pwcassurance for more information on PwC's Assurance practice.

The world is changing quickly, which is why PwC is adapting quickly. We're capitalizing on trends that will impact corporate reporting.

Our focus is on globalization, technology, sustainability and environmental reporting, population shifts and regulation. We combine skills and experience to help our clients address their challenges.

Job Description
The Assurance Innovation group is developing capabilities leveraging the latest in Open Source technologies to automate and accelerate our client engagements across the enterprise. We are focused on incorporating the latest in machine learning, Big Data, NoSQL, cutting-edge development languages, and advanced data processing techniques to include structured and unstructured information in a loosely coupled ecosystem delivering a technology platform that positions PwC for the future.

The Data Science team is transforming the way PwC does business. The team helps develop cutting edge data analytic and automation applications using the latest technologies on both big and small but complex data sources. The models will be seen by senior partners within the firm and will help shape how the firm goes to market.

The Data Science team works on the cutting edge of technologies in an applied context to develop these systems.

This team will build systems that are immediately operational and that can deliver their outputs in a web-based or visualization tool context.

The models will be part of a larger system which provides data and visualization support.

Position/Program Requirements
Minimum Year(s) of Experience: 5 in data analytics, development and programming.

Minimum Degree Required: Bachelor's degree or 7 or more years in data analytics, development and programming and experience leading teams.

Knowledge Preferred:

Demonstrates extensive knowledge and/or a proven record of success in an applied subject matter role, such as IT, finance, accounting, energy or health care, emphasizing data analytics for a global network of professional services firms, including the following areas:

- Understanding of NoSQL (Graph, Document, Columnar) database models, XML, relational and other database models and associated SQL;

- Understanding of ETL tools and techniques, such as Talend and Mapforce, and how to map the transformation and flow of data from a source to a target system;

- Performing in development language environments, e.g. Python, Java or equivalent, and applying analytical methods to large and complex datasets leveraging one of those languages;

- Automating complex processes; and,

- Proven ability in data analytics management.

Skills Preferred:

Demonstrates extensive abilities and/or a proven record of success in the application of statistical or numerical methods, data mining or data-driven problem solving, including the following areas:

- Applying knowledge of Python-based data science tools, such as pandas and NumPy, in projects;

- Utilizing programming skills and knowledge on how to write models which can be directly used in production as part of a large scale system;

- Applying knowledge of data wrangling techniques and scripting languages in projects, with proven ability working in a cloud-based infrastructure environment;

- Understanding of not only how to develop data science analytic models but how to operationalize these models so they can run in an automated context;

- Understanding of machine learning algorithms, such as k-NN, GBM, Neural Networks, Naive Bayes, SVM, and Decision Forests;

- Applying knowledge of technologies such as H2O.ai, Google Machine Learning and deep learning in projects;

- Proven ability with NLP and text-based extraction techniques;

- Working independently with minimal guidance;

- Leveraging problem solving and troubleshooting skills with proven ability exercising mature judgment;

- Prioritizing workload effectively to meet deadlines and work objectives;

- Writing clearly and succinctly in a manner that appeals to a wide audience;

- Communicating complex engineering concepts succinctly to senior non-technical and technical decision makers;

- Performing development, data analytics and programming/scripting, especially Python, Java, Scala, C++, R, SQL, etc.;

- Applying techniques such as multivariate regressions, Bayesian probabilities, clustering algorithms, machine learning, dynamic programming, stochastic processes, queuing theory and algorithmic knowledge to efficiently research and solve complex development problems, and applying engineering methods to define, predict and evaluate the results obtained;

- Visualizing and communicating analytical results, ideally using open source visualization technologies such as HTML, JavaScript, and related packages such as D3;

- Using large data sets, along with analytical scripting tools and visualization platforms to produce actionable insights for clients; data cleansing, transformation, and modeling in order to produce a clear story that is easily comprehended by non-technical audiences;

- Leading, training and working with other data scientists in designing effective analytical approaches taking into consideration performance and scalability to large datasets; and,

- Performing unit and system testing to validate the output of the analytic procedures against expected results.