Data Preprocessing

Because they are typically huge (often several gigabytes or more) and are likely to originate from multiple, heterogeneous sources, today’s real-world databases are highly susceptible to noisy, incomplete, and unreliable data.


Getting to Know Your Data

Before you start, you will want to answer the following questions. What kinds of attributes or fields make up your data? What kinds of values does each attribute have? Which attributes are discrete and which are continuous-valued? What does the data look like? How are the values distributed? Can we visualize the data to get a better sense of it all? Can we spot outliers? Can we measure how similar some data objects are to others? The assessment that follows will help you gain this kind of insight into the data. Knowing your data is helpful for data preprocessing, the first important task of data analysis.
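Several of the questions above (how the values are distributed, whether there are outliers, how similar two data objects are) can be sketched with Python's standard library alone. This is only a minimal illustration under stated assumptions: the `ages` values are made up, the 1.5 × IQR rule is one common outlier convention among several, and Euclidean distance is just one possible similarity measure.

```python
import math
import statistics


def summarize(values):
    """Return a few descriptive statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def euclidean_distance(a, b):
    """Distance between two data objects given as numeric tuples."""
    return math.dist(a, b)


# Hypothetical sample attribute, invented for this sketch.
ages = [23, 25, 27, 29, 31, 33, 35, 90]

print(summarize(ages))
print(iqr_outliers(ages))                   # the value 90 is flagged
print(euclidean_distance((0, 0), (3, 4)))   # 5.0
```

A smaller distance means two objects are more alike. In practice, attributes on different scales are usually normalized before distances are compared.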


Introduction to Data Science

  1. Why Data Science?
  2. What is Data Science?
  3. What is the Data Science Process?
  4. What Kinds of Data Can Be Analyzed?
  5. What Kinds of Patterns Can Be Analyzed?
  6. Which Technologies are Used?
  7. Which Kinds of Applications Are Targeted?
  8. Major Issues in Data Science
  9. Data Science and Society

Why Data Science?

We live in a world where vast amounts of data are collected every day. Analyzing such data to discover knowledge from it is an important need.

We live in the information age

It is a popular saying, but it is also true: we live in the information age. Every day, terabytes or petabytes of data pour into our computer networks, the World Wide Web (WWW), and various data storage devices from business, society, science and engineering, medicine, and almost every other aspect of everyday life. Powerful and versatile tools are badly needed to automatically uncover valuable information in these enormous quantities of data and turn it into organized knowledge.


How to think like a computer scientist?

What is science? And what is computer science, anyway? Computer science is the study of computers: the art and science of representing and processing information. What, then, is a computer scientist? Everyone has their own perspective on the definition. It generally relates to the theory, engineering, and solutions applied in a specific domain of computer science. Most of the time, the field is dominated by men rather than women, for two reasons.


Learn Python for Data Science

Why Python?

To help others learn Python faster, I decided to create this tutorial. In it, we will take in bite-sized pieces of information on how to use Python for data science, practice until we are comfortable, and then use it for our own purposes. Why learn Python for data science? Python has recently gained a lot of interest as a language for data science because of its extensive community-supported libraries, its integration features, and the improved programmer productivity it offers. However, it has several limitations: difficulty moving to other languages (its syntax has few similarities with theirs, such as semicolons or declared types), weakness in mobile computing, slow execution (it is interpreted rather than compiled), run-time errors that stem from design restrictions such as dynamic typing, and underdeveloped database access layers.


Pokemon TCG price tracker using Python

If you want to watch for price drops on a website, you can use this script. In this demo, I’m going to track the price of a Pokemon TCG card (Reshiram & Charizard GX - 217/214 - Hyper Rare) on its Troll & Toad page and automatically send myself an email if the price drops below $50. You can adapt the script for other websites.
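The core of such a tracker can be sketched with the standard library. This is not the script from the article, only a rough outline under several assumptions: the alert fires when the price falls below a threshold, the price is the first dollar amount found in the page text, and all function names, addresses, and the URL in the usage note are hypothetical placeholders. A real page usually needs a proper HTML parser and a site-specific selector.

```python
import re
import smtplib
import urllib.request
from email.message import EmailMessage

PRICE_THRESHOLD = 50.00  # assumption: alert when the price is below this


def fetch_page(url):
    """Download the page HTML (no retries or error handling in this sketch)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_price(text):
    """Return the first dollar amount in text as a float, or None."""
    match = re.search(r"\$\s*(\d+(?:,\d{3})*(?:\.\d+)?)", text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def should_alert(price, threshold=PRICE_THRESHOLD):
    """True only when a price was found and it is below the threshold."""
    return price is not None and price < threshold


def build_alert_email(price, url, sender, recipient):
    """Compose the notification message (sending is a separate step)."""
    message = EmailMessage()
    message["Subject"] = f"Price alert: ${price:.2f}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(f"The price dropped to ${price:.2f}.\n{url}")
    return message


def send_email(message, host, port, user, password):
    """Send the message over SMTP with SSL using the given credentials."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(user, password)
        server.send_message(message)
```

A typical run would call `fetch_page` on the product URL, pass the result to `parse_price`, and, if `should_alert` returns true, build the message and hand it to `send_email`. Scheduling the check (for example, once an hour) is left to a loop or an external scheduler.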