This essay focuses on composing and creating a schema and storing data.
Objectives:
● Understand dataset with data scientist mind-set.
● Understand and design computation logic and routines in Python.
● Assess use of Python only and Python data structures to perform extract, load, and transformation operations.
● Assess the design and use of database SQL and methods to perform extract, load, transformation and calculation operations.
● Structure code in appropriate methods (functions), looping and conditions.

(a) From tweets.json, find and apply all entity types using Python code. For example, hashtag is one entity type. (2 marks)

(b) Create the tweets schema and store tweets in tweets.json file to a SQLite database. The tweets schema contains the following fields: id, created_at, full_text, favorite_count and retweet_count. The field id is the primary key. (5 marks)

(c) Compose and create schema and store data.
(i) Create the entities schema with the following fields and requirements:
    id: primary key
    tweet_id: foreign key, which links to the id field of the tweets table
    type: possible values found in Question 1(a)
    value: contains entity text values, which appear in a tweet’s full_text
    start index and end index: stored the values
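Parts (a) to (c) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a model answer. It assumes each tweet is already parsed into a dict that follows the Twitter API layout, with an `entities` dict keyed by entity type and an `indices` pair on every entity. The function names and the column names `start_index` and `end_index` are hypothetical, and the loading of tweets.json is left out because the file layout is not given here.

```python
import sqlite3


def find_entity_types(tweets):
    """Return the sorted entity type names found across all tweets."""
    types = set()
    for tweet in tweets:
        # Assumption: "entities" maps a type name (e.g. "hashtags") to a list.
        types.update(tweet.get("entities", {}).keys())
    return sorted(types)


def create_schema(conn):
    """Create the tweets and entities tables if they do not exist yet."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS tweets (
            id             INTEGER PRIMARY KEY,
            created_at     TEXT,
            full_text      TEXT,
            favorite_count INTEGER,
            retweet_count  INTEGER
        );
        CREATE TABLE IF NOT EXISTS entities (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            tweet_id    INTEGER NOT NULL REFERENCES tweets(id),
            type        TEXT NOT NULL,
            value       TEXT,
            start_index INTEGER,  -- hypothetical column name
            end_index   INTEGER   -- hypothetical column name
        );
    """)


def store_tweets(conn, tweets):
    """Insert every tweet, then one entities row per entity it contains."""
    for tweet in tweets:
        conn.execute(
            "INSERT INTO tweets VALUES (?, ?, ?, ?, ?)",
            (tweet["id"], tweet["created_at"], tweet["full_text"],
             tweet["favorite_count"], tweet["retweet_count"]),
        )
        for entity_type, items in tweet.get("entities", {}).items():
            for item in items:
                # Assumption: "indices" holds [start, end] into full_text.
                start, end = item["indices"]
                conn.execute(
                    "INSERT INTO entities"
                    " (tweet_id, type, value, start_index, end_index)"
                    " VALUES (?, ?, ?, ?, ?)",
                    (tweet["id"], entity_type,
                     tweet["full_text"][start:end], start, end),
                )
    conn.commit()


if __name__ == "__main__":
    # Made-up sample tweet, used only to exercise the functions above.
    sample = [{
        "id": 1,
        "created_at": "Mon Jan 01 00:00:00 +0000 2018",
        "full_text": "Hello #world",
        "favorite_count": 2,
        "retweet_count": 1,
        "entities": {"hashtags": [{"text": "world", "indices": [6, 12]}]},
    }]
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    store_tweets(connection, sample)
    print(find_entity_types(sample))
    print(connection.execute("SELECT * FROM entities").fetchall())
```

Taking the value as a slice of full_text keeps it identical to what appears in the tweet, including the leading "#" of a hashtag. Whether the marker expects the text with or without that prefix is not stated, so this is a design choice rather than a requirement.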
Databases and spreadsheets (such as Microsoft Excel) are both convenient ways to store information. The primary differences between the two are:
Spreadsheets were originally designed for one user, and their characteristics reflect that. They’re great for a single user or a small number of users who don’t need to do a lot of incredibly complicated data manipulation. Databases, on the other hand, are designed to hold much larger collections of organized information.