Question 1

I am using Orange (in Python) for some data mining tasks, specifically clustering. I have gone through the tutorial and read most of the documentation, but I still have a problem: all the examples in the docs and tutorials assume that the data already sits in a tab-delimited file. Nothing explains how to create a new table from scratch in code. For example, I want to create a table of word frequencies across different documents.

Maybe I am missing something so if anyone has any insight it'd be appreciated.

Thanks George

EDIT:

This is how I create my table

import Orange

# First construct the domain object (the table's top row).
features = [Orange.data.variable.Continuous(str(var)) for var in variables]
# The second argument indicates that the last attribute must not be treated as a class.
classed = False
domain = Orange.data.Domain(features, classed)
# Add the data rows, assuming matrix is a list of rows with one value per variable.
t = Orange.data.Table(domain, matrix)
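
For the word-frequency case mentioned above, the `variables` and `matrix` inputs could be prepared roughly like this. It is only a sketch using the standard library: the function name `build_frequency_matrix`, the sample documents, and the naive whitespace tokenisation are assumptions for illustration, not part of Orange.

```python
from collections import Counter


def build_frequency_matrix(documents):
    """Return (variables, matrix) for a list of document strings.

    variables: sorted list of the distinct words (one column per word)
    matrix:    one row per document, holding that document's word counts
    """
    # Naive tokenisation (lowercase, split on whitespace); an assumption.
    counts = [Counter(doc.lower().split()) for doc in documents]
    # The vocabulary is the union of the words seen in any document.
    variables = sorted(set().union(*counts))
    # A Counter returns 0 for words that a document does not contain.
    matrix = [[float(c[word]) for word in variables] for c in counts]
    return variables, matrix


# Hypothetical sample data.
documents = ["the cat sat", "the dog sat on the mat"]
variables, matrix = build_frequency_matrix(documents)
```

The resulting `variables` and `matrix` have the shape the snippet above expects: one name per column and one list of numbers per row.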

Question 2

This took me hours to figure out. In Python, do this:

import Orange
List, Of, Column, Variables = [Orange.feature.Discrete(x) for x in ['What','Theyre','Called','AsStrings']]
Domain = Orange.data.Domain([List, Of, Column, Variables])
Table = Orange.data.Table(Domain)
Table.save('NewTable.tab')

I'd tell you what each bit of code does, but as of now I'm not really sure. It's funny that such a powerful toolkit should have such hard-to-understand documentation, but I suspect it's because its entire user base has doctorates.
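
Another route, if the in-memory API stays confusing, is to write the tab-delimited file yourself and load it the way the tutorials do. The sketch below assumes the three-line header that Orange's `.tab` format uses as far as I know (column names, then column types, then optional flags such as `class`); the helper name `orange_tab_text` and the sample values are made up for illustration.

```python
def orange_tab_text(names, rows):
    """Build the text of a tab-delimited table with a three-line header.

    Assumed layout: line 1 = column names, line 2 = column types,
    line 3 = optional flags (left empty here), then one line per data row.
    """
    lines = [
        "\t".join(names),                        # column names
        "\t".join(["continuous"] * len(names)),  # every column is numeric
        "\t".join([""] * len(names)),            # no flags: plain attributes
    ]
    lines.extend("\t".join(str(value) for value in row) for row in rows)
    return "\n".join(lines) + "\n"


# Hypothetical word counts for two documents.
text = orange_tab_text(["cat", "dog"], [[1.0, 0.0], [0.0, 2.0]])

# To save it, write the text to a file ending in .tab, for example:
# with open("NewTable.tab", "w") as handle:
#     handle.write(text)
```

Building the text in one function keeps the header layout in a single place, so it is easy to adjust if your Orange version expects different type names.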

How do I create a new data table in Orange?
